TY - GEN
T1 - M2CNet
T2 - 36th Chinese Control and Decision Conference, CCDC 2024
AU - Ye, He
AU - Zhou, Zhiqiang
AU - Wang, Yuhao
AU - Chen, Weiyi
AU - Miao, Lingjuan
AU - Li, Jiaqi
N1 - Publisher Copyright:
© 2024 IEEE.
PY - 2024
Y1 - 2024
N2 - Since there is no ground truth for infrared and visible image fusion, where the source images differ greatly in appearance and content, most fusion methods are based on unsupervised learning. However, the losses currently used in unsupervised learning, i.e., pixel losses and structural losses, do not accurately characterize the modal differences between source images. Moreover, complex losses that combine multiple terms show poor robustness across datasets and make model convergence difficult. In addition, most methods struggle to achieve a satisfactory information balance between source images. To address these challenges, we propose a fusion method based on dual marginal contrastive learning, namely M2CNet. First, we propose a novel contrastive loss with dual margin penalties to strengthen cross-modal connections between the fused image and the source images. Notably, our approach relies solely on this succinct loss to address the fusion task, without the currently used losses. Second, we propose a three-branch network that fuses extensive complementary information from the source images and achieves an excellent trade-off between them. Qualitative and quantitative experiments demonstrate the superiority of our method over state-of-the-art methods.
AB - Since there is no ground truth for infrared and visible image fusion, where the source images differ greatly in appearance and content, most fusion methods are based on unsupervised learning. However, the losses currently used in unsupervised learning, i.e., pixel losses and structural losses, do not accurately characterize the modal differences between source images. Moreover, complex losses that combine multiple terms show poor robustness across datasets and make model convergence difficult. In addition, most methods struggle to achieve a satisfactory information balance between source images. To address these challenges, we propose a fusion method based on dual marginal contrastive learning, namely M2CNet. First, we propose a novel contrastive loss with dual margin penalties to strengthen cross-modal connections between the fused image and the source images. Notably, our approach relies solely on this succinct loss to address the fusion task, without the currently used losses. Second, we propose a three-branch network that fuses extensive complementary information from the source images and achieves an excellent trade-off between them. Qualitative and quantitative experiments demonstrate the superiority of our method over state-of-the-art methods.
KW - contrastive learning
KW - image fusion
KW - marginal softmax
KW - patchwise contrastive loss
UR - http://www.scopus.com/inward/record.url?scp=85200375339&partnerID=8YFLogxK
U2 - 10.1109/CCDC62350.2024.10587987
DO - 10.1109/CCDC62350.2024.10587987
M3 - Conference contribution
AN - SCOPUS:85200375339
T3 - Proceedings of the 36th Chinese Control and Decision Conference, CCDC 2024
SP - 396
EP - 402
BT - Proceedings of the 36th Chinese Control and Decision Conference, CCDC 2024
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 25 May 2024 through 27 May 2024
ER -