TY - JOUR
T1 - General retinal image enhancement via reconstruction
T2 - Bridging distribution shifts using latent diffusion adaptors
AU - Yang, Bingyu
AU - Han, Haonan
AU - Zhang, Weihang
AU - Li, Huiqi
N1 - Publisher Copyright:
© 2025 Elsevier B.V.
PY - 2025/7
Y1 - 2025/7
N2 - Deep learning-based fundus image enhancement has attracted extensive research attention recently and has shown remarkable effectiveness in improving the visibility of low-quality images. However, existing methods are often constrained to specific datasets and degradations, leading to poor generalization and difficulties during fine-tuning. Therefore, a general method for fundus image enhancement is proposed for improved generalizability and flexibility, which decomposes the enhancement task into reconstruction and adaptation phases. In the reconstruction phase, self-supervised training with unpaired data is employed, allowing extensive public datasets to be leveraged to improve the generalizability of the model. In the adaptation phase, the model is fine-tuned to the target datasets and their degradations using the pre-trained weights from the reconstruction phase. The proposed method improves the feasibility of latent diffusion models for retinal image enhancement. An adaptation loss and an enhancement adaptor are introduced into the autoencoder and diffusion network, requiring less paired training data and fewer trainable parameters while converging faster than training from scratch. The proposed method can be easily fine-tuned, and experiments demonstrate its adaptability to different datasets and degradations. Additionally, the reconstruction-adaptation framework can be applied to different backbones and other modalities, which shows its generality.
AB - Deep learning-based fundus image enhancement has attracted extensive research attention recently and has shown remarkable effectiveness in improving the visibility of low-quality images. However, existing methods are often constrained to specific datasets and degradations, leading to poor generalization and difficulties during fine-tuning. Therefore, a general method for fundus image enhancement is proposed for improved generalizability and flexibility, which decomposes the enhancement task into reconstruction and adaptation phases. In the reconstruction phase, self-supervised training with unpaired data is employed, allowing extensive public datasets to be leveraged to improve the generalizability of the model. In the adaptation phase, the model is fine-tuned to the target datasets and their degradations using the pre-trained weights from the reconstruction phase. The proposed method improves the feasibility of latent diffusion models for retinal image enhancement. An adaptation loss and an enhancement adaptor are introduced into the autoencoder and diffusion network, requiring less paired training data and fewer trainable parameters while converging faster than training from scratch. The proposed method can be easily fine-tuned, and experiments demonstrate its adaptability to different datasets and degradations. Additionally, the reconstruction-adaptation framework can be applied to different backbones and other modalities, which shows its generality.
KW - Distribution shifts
KW - Enhancement adaptor
KW - Latent diffusion models
KW - Retinal image enhancement
UR - http://www.scopus.com/inward/record.url?scp=105003750700&partnerID=8YFLogxK
U2 - 10.1016/j.media.2025.103603
DO - 10.1016/j.media.2025.103603
M3 - Article
AN - SCOPUS:105003750700
SN - 1361-8415
VL - 103
JO - Medical Image Analysis
JF - Medical Image Analysis
M1 - 103603
ER -