Multi-Modal Domain Generalization for Cross-Scene Hyperspectral Image Classification

Yuxiang Zhang, Mengmeng Zhang*, Wei Li, Ran Tao

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

1 citation (Scopus)

Abstract

Large-scale pre-trained image-text foundation models have excelled in a number of downstream applications. Most domain generalization techniques, however, have not mined linguistic-modality knowledge to improve model generalization. Text information has likewise been ignored in hyperspectral image (HSI) classification tasks. To address these shortcomings, a Multi-modal Domain Generalization Network (MDG) is proposed to learn cross-domain invariant representations from a cross-domain shared semantic space. The proposed method trains only on the source domain (SD), after which the model is transferred directly to the target domain (TD). Visual and linguistic features are extracted by a dual-stream architecture consisting of an image encoder and a text encoder. A generator is designed to produce extended domain (ED) samples that differ from the SD. Furthermore, linguistic features are used to construct the cross-domain shared semantic space, where visual-linguistic alignment is accomplished by supervised contrastive learning. Extensive experiments on two datasets show that the proposed method outperforms state-of-the-art approaches.
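The abstract does not give the exact form of the alignment loss, so the following is only a minimal sketch of one common way to align visual features with class-level text features under label supervision. It assumes that each class has a single text embedding, that similarity is cosine similarity scaled by a temperature, and that the loss is a cross-entropy over classes. The names `l2_normalize`, `alignment_loss`, and the default `temperature` value are hypothetical and are not taken from the paper.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def alignment_loss(image_feats, labels, text_feats, temperature=0.07):
    """Mean cross-entropy pulling each image feature toward its class text feature.

    image_feats: list of N visual feature vectors (assumed output of an image encoder)
    labels:      list of N integer class indices
    text_feats:  list of C text feature vectors, one per class (assumed encoder output)
    temperature: hypothetical scaling value, not specified in the abstract
    """
    text_unit = [l2_normalize(t) for t in text_feats]
    total = 0.0
    for feat, label in zip(image_feats, labels):
        f = l2_normalize(feat)
        # Cosine similarity to every class text feature, scaled by temperature.
        logits = [sum(a * b for a, b in zip(f, t)) / temperature for t in text_unit]
        # Numerically stable log-sum-exp for the softmax denominator.
        m = max(logits)
        log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
        total += log_sum_exp - logits[label]
    return total / len(image_feats)
```

Under these assumptions the loss is small when each visual feature points in the same direction as the text feature of its own class, and larger when it is closer to the text feature of another class. In a setup like the one the abstract describes, the same loss could be applied to both SD and ED samples, but that pairing is an assumption here as well.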

Original language: English
Title of host publication: ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing, Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781728163277
Publication status: Published - 2023
Event: 48th IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2023 - Rhodes Island, Greece
Duration: 4 Jun 2023 to 10 Jun 2023

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
2023-June
ISSN (Print): 1520-6149

Conference

Conference: 48th IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2023
Country/Territory: Greece
City: Rhodes Island
Period: 4/06/23 to 10/06/23
