Cross-modal attention network for retinal disease classification based on multi-modal images

Zirong Liu; Yan Hu; Zhongxi Qiu; Yanyan Niu; Dan Zhou; Xiaoling Li; Junyong Shen; Hongyang Jiang; Heng Li; Jiang Liu

doi:10.1364/BOE.516764

Zirong Liu, Yan Hu, Zhongxi Qiu, Yanyan Niu, Dan Zhou, Xiaoling Li, Junyong Shen, Hongyang Jiang, Heng Li, Jiang Liu^*

^*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Multi-modal eye disease screening improves diagnostic accuracy by providing lesion information from different sources. However, existing multi-modal automatic diagnosis methods tend to focus on the specificity of modalities and ignore the spatial correlation of images. This paper proposes a novel cross-modal retinal disease diagnosis network (CRD-Net) that digs out the relevant features from modal images aided for multiple retinal disease diagnosis. Specifically, our model introduces a cross-modal attention (CMA) module to query and adaptively pay attention to the relevant features of the lesion in the different modal images. In addition, we also propose multiple loss functions to fuse features with modality correlation and train a multi-modal retinal image classification network to achieve a more accurate diagnosis. Experimental evaluation on three publicly available datasets shows that our CRD-Net outperforms existing single-modal and multi-modal methods, demonstrating its superior performance.
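The abstract does not give the internals of the CMA module. As a rough illustration of the general idea only, the following is a minimal sketch of generic scaled dot-product cross-attention, in which feature vectors from one modality act as queries over feature vectors from another. It is not the authors' implementation: the function names, the toy feature values, and the absence of learned projection weights are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Generic scaled dot-product attention (illustrative, no learned weights).

    Each query vector (from modality A) is scored against every key vector
    (from modality B); the softmax of those scores weights the value vectors.
    """
    dim = len(keys[0])
    attended, weights = [], []
    for q in queries:
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim) for k in keys
        ]
        w = softmax(scores)
        weights.append(w)
        attended.append(
            [sum(wj * v[c] for wj, v in zip(w, values)) for c in range(len(values[0]))]
        )
    return attended, weights


# Hypothetical toy features: 2 locations in modality A, 3 in modality B.
modality_a_feats = [[1.0, 0.0], [0.0, 1.0]]
modality_b_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

attended, weights = cross_attention(
    modality_a_feats, modality_b_feats, modality_b_feats
)
```

Each row of `weights` sums to 1 and indicates how strongly a location in the first modality attends to each location in the second; `attended` holds the resulting weighted mixtures of the second modality's features.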

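Original language	English
Pages (from-to)	3699-3714
Number of pages	16
Journal	Biomedical Optics Express
Volume	15
Issue number	6
DOI	https://doi.org/10.1364/BOE.516764
Publication status	Published - 1 Jun 2024
Externally published	Yes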

Cite this

@article{36bc9710205b495b960ba5009d63bb22,

title = "Cross-modal attention network for retinal disease classification based on multi-modal images",

abstract = "Multi-modal eye disease screening improves diagnostic accuracy by providing lesion information from different sources. However, existing multi-modal automatic diagnosis methods tend to focus on the specificity of modalities and ignore the spatial correlation of images. This paper proposes a novel cross-modal retinal disease diagnosis network (CRD-Net) that digs out the relevant features from modal images aided for multiple retinal disease diagnosis. Specifically, our model introduces a cross-modal attention (CMA) module to query and adaptively pay attention to the relevant features of the lesion in the different modal images. In addition, we also propose multiple loss functions to fuse features with modality correlation and train a multi-modal retinal image classification network to achieve a more accurate diagnosis. Experimental evaluation on three publicly available datasets shows that our CRD-Net outperforms existing single-modal and multi-modal methods, demonstrating its superior performance.",

author = "Zirong Liu and Yan Hu and Zhongxi Qiu and Yanyan Niu and Dan Zhou and Xiaoling Li and Junyong Shen and Hongyang Jiang and Heng Li and Jiang Liu",

note = "Publisher Copyright: {\textcopyright} 2024 Optica Publishing Group under the terms of the Optica Open Access Publishing Agreement.",

year = "2024",

month = jun,

day = "1",

doi = "10.1364/BOE.516764",

language = "English",

volume = "15",

pages = "3699--3714",

journal = "Biomedical Optics Express",

issn = "2156-7085",

publisher = "Optica Publishing Group (formerly OSA)",

number = "6",

}

TY - JOUR

T1 - Cross-modal attention network for retinal disease classification based on multi-modal images

AU - Liu, Zirong

AU - Hu, Yan

AU - Qiu, Zhongxi

AU - Niu, Yanyan

AU - Zhou, Dan

AU - Li, Xiaoling

AU - Shen, Junyong

AU - Jiang, Hongyang

AU - Li, Heng

AU - Liu, Jiang

PY - 2024/6/1

Y1 - 2024/6/1

N2 - Multi-modal eye disease screening improves diagnostic accuracy by providing lesion information from different sources. However, existing multi-modal automatic diagnosis methods tend to focus on the specificity of modalities and ignore the spatial correlation of images. This paper proposes a novel cross-modal retinal disease diagnosis network (CRD-Net) that digs out the relevant features from modal images aided for multiple retinal disease diagnosis. Specifically, our model introduces a cross-modal attention (CMA) module to query and adaptively pay attention to the relevant features of the lesion in the different modal images. In addition, we also propose multiple loss functions to fuse features with modality correlation and train a multi-modal retinal image classification network to achieve a more accurate diagnosis. Experimental evaluation on three publicly available datasets shows that our CRD-Net outperforms existing single-modal and multi-modal methods, demonstrating its superior performance.

AB - Multi-modal eye disease screening improves diagnostic accuracy by providing lesion information from different sources. However, existing multi-modal automatic diagnosis methods tend to focus on the specificity of modalities and ignore the spatial correlation of images. This paper proposes a novel cross-modal retinal disease diagnosis network (CRD-Net) that digs out the relevant features from modal images aided for multiple retinal disease diagnosis. Specifically, our model introduces a cross-modal attention (CMA) module to query and adaptively pay attention to the relevant features of the lesion in the different modal images. In addition, we also propose multiple loss functions to fuse features with modality correlation and train a multi-modal retinal image classification network to achieve a more accurate diagnosis. Experimental evaluation on three publicly available datasets shows that our CRD-Net outperforms existing single-modal and multi-modal methods, demonstrating its superior performance.

UR - http://www.scopus.com/inward/record.url?scp=85195065341&partnerID=8YFLogxK

U2 - 10.1364/BOE.516764

DO - 10.1364/BOE.516764

M3 - Article

AN - SCOPUS:85195065341

SN - 2156-7085

VL - 15

SP - 3699

EP - 3714

JO - Biomedical Optics Express

JF - Biomedical Optics Express

IS - 6

ER -
