TY - GEN
T1 - Domain Adaptation for Semantic Segmentation of Cataract Surgical Images Based on Masked Image Consistency
AU - Zhang, Yuzhu
AU - Pan, Yijie
AU - Ou, Mingyang
AU - Gong, Guanghui
AU - Zhang, Qinhu
AU - Li, Haojin
AU - Li, Heng
N1 - Publisher Copyright:
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2025.
PY - 2025
Y1 - 2025
AB - Cataract surgery is a complex procedure requiring precise execution of multiple steps. To improve the accuracy and efficiency of cataract surgeries, we present a semantic segmentation model for cataract surgery scenes. Our model leverages Unsupervised Domain Adaptation (UDA) techniques to enhance segmentation performance in clinical surgical environments, addressing challenges such as domain shift, occlusions between surgical tools and tissues, and the long-tail problem. The model uses a teacher-student framework in which a student model is trained in the target domain with pseudo-labels generated by an Exponential Moving Average (EMA) teacher model, supporting robust learning across domains. Additionally, we use a Masked Image Consistency (MIC) module to improve the model’s understanding of occluded regions by enforcing consistency between masked and unmasked predictions. To mitigate class imbalance between anatomical structures and surgical tools, we employ a maximum squares loss, enabling the model to achieve balanced learning. Our results demonstrate that the proposed model improves segmentation accuracy and robustness in cataract surgery scenarios.
KW - Cataract Surgery
KW - Semantic Segmentation
KW - Unsupervised Domain Adaptation
UR - http://www.scopus.com/inward/record.url?scp=86000444520&partnerID=8YFLogxK
U2 - 10.1007/978-981-96-1907-8_24
DO - 10.1007/978-981-96-1907-8_24
M3 - Conference contribution
AN - SCOPUS:86000444520
SN - 9789819619061
T3 - Communications in Computer and Information Science
SP - 243
EP - 254
BT - Applied Intelligence - 2nd International Conference, ICAI 2024, Proceedings
A2 - Huang, De-Shuang
A2 - Chen, Wei
A2 - Zhang, Chuanlei
A2 - Pan, Yijie
A2 - Zhang, Qinhu
A2 - Kong, Xiangzeng
PB - Springer Science and Business Media Deutschland GmbH
T2 - 2nd International Conference on Applied Intelligence, ICAI 2024
Y2 - 22 November 2024 through 25 November 2024
ER -