TY - GEN
T1 - Classification of Multisource Remote Sensing Images Using Multimodal Equilateral Absorption Network
AU - Zhao, Yuyang
AU - Zhang, Mengmeng
AU - Gao, Yunhao
AU - Li, Wei
N1 - Publisher Copyright:
© 2024 ACM.
PY - 2024/1/19
Y1 - 2024/1/19
N2 - Fusing multisource remote sensing data is an important approach to improving pixel-wise classification performance. Generally, the richer the information input to the model, the more diverse the knowledge it can learn, which in turn improves classification performance. However, existing fusion methods usually accept only two modal inputs and struggle to balance the consistency and diversity of multisource features. In this paper, we propose a novel classification network, the multimodal equilateral absorption network (MEANet), which can fuse multiple kinds of remote sensing images. Specifically, three modal features are first extracted by a three-branch CNN. Then, a cross-modal interacting module (CIM) fuses the multimodal features. Third, an improved triplet loss is designed to trade off feature diversity against consistency, allowing the network to acquire multisource information more efficiently. Finally, pixel-wise summation and a fully connected (FC) layer produce the final classification results. Experiments on two datasets show that the proposed MEANet achieves competitive classification performance compared to several state-of-the-art methods.
AB - Fusing multisource remote sensing data is an important approach to improving pixel-wise classification performance. Generally, the richer the information input to the model, the more diverse the knowledge it can learn, which in turn improves classification performance. However, existing fusion methods usually accept only two modal inputs and struggle to balance the consistency and diversity of multisource features. In this paper, we propose a novel classification network, the multimodal equilateral absorption network (MEANet), which can fuse multiple kinds of remote sensing images. Specifically, three modal features are first extracted by a three-branch CNN. Then, a cross-modal interacting module (CIM) fuses the multimodal features. Third, an improved triplet loss is designed to trade off feature diversity against consistency, allowing the network to acquire multisource information more efficiently. Finally, pixel-wise summation and a fully connected (FC) layer produce the final classification results. Experiments on two datasets show that the proposed MEANet achieves competitive classification performance compared to several state-of-the-art methods.
KW - Feature fusion
KW - improved triplet loss
KW - multimodal classification
KW - multisource remote sensing
UR - http://www.scopus.com/inward/record.url?scp=85192789372&partnerID=8YFLogxK
U2 - 10.1145/3647649.3647680
DO - 10.1145/3647649.3647680
M3 - Conference contribution
AN - SCOPUS:85192789372
T3 - ACM International Conference Proceeding Series
SP - 185
EP - 191
BT - ICIGP 2024 - Proceedings of the 2024 7th International Conference on Image and Graphics Processing
PB - Association for Computing Machinery
T2 - 7th International Conference on Image and Graphics Processing, ICIGP 2024
Y2 - 19 January 2024 through 21 January 2024
ER -