TY - JOUR
T1 - IBFusion
T2 - An Infrared and Visible Image Fusion Method Based on Infrared Target Mask and Bimodal Feature Extraction Strategy
AU - Bai, Yang
AU - Gao, Meijing
AU - Li, Shiyu
AU - Wang, Ping
AU - Guan, Ning
AU - Yin, Haozheng
AU - Yan, Yonghao
N1 - Publisher Copyright:
© 2024 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
PY - 2024
Y1 - 2024
N2 - The fusion of infrared (IR) and visible (VIS) images aims to capture complementary information from diverse sensors, resulting in a fused image that enhances the overall human perception of the scene. However, existing fusion methods face challenges preserving diverse feature information, leading to cross-modal interference, feature degradation, and detail loss in the fused image. To solve the above problems, this paper proposes an image fusion method based on the infrared target mask and bimodal feature extraction strategy, termed IBFusion. Firstly, we define an infrared target mask, employing it to retain crucial information from the source images in the fused result. Additionally, we devise a mixed loss function, encompassing content loss, gradient loss, and structure loss, to ensure the coherence of the fused image with the IR and VIS images. Then, the mask is introduced into the mixed loss function to guide feature extraction and unsupervised network optimization. Secondly, we create a bimodal feature extraction strategy and construct a Dual-channel Multi-scale Feature Extraction Module (DMFEM) to extract thermal target information from the IR image and background texture information from the VIS image. This module retains the complementary information of the two source images. Finally, we use the Feature Fusion Module (FFM) to fuse the features effectively, generating the fusion result. Experiments on three public datasets demonstrate that the fusion results of our method have prominent infrared targets and clear texture details. Both subjective and objective assessments are better than the other twelve advanced algorithms, proving our method’s effectiveness.
AB - The fusion of infrared (IR) and visible (VIS) images aims to capture complementary information from diverse sensors, resulting in a fused image that enhances the overall human perception of the scene. However, existing fusion methods face challenges preserving diverse feature information, leading to cross-modal interference, feature degradation, and detail loss in the fused image. To solve the above problems, this paper proposes an image fusion method based on the infrared target mask and bimodal feature extraction strategy, termed IBFusion. Firstly, we define an infrared target mask, employing it to retain crucial information from the source images in the fused result. Additionally, we devise a mixed loss function, encompassing content loss, gradient loss, and structure loss, to ensure the coherence of the fused image with the IR and VIS images. Then, the mask is introduced into the mixed loss function to guide feature extraction and unsupervised network optimization. Secondly, we create a bimodal feature extraction strategy and construct a Dual-channel Multi-scale Feature Extraction Module (DMFEM) to extract thermal target information from the IR image and background texture information from the VIS image. This module retains the complementary information of the two source images. Finally, we use the Feature Fusion Module (FFM) to fuse the features effectively, generating the fusion result. Experiments on three public datasets demonstrate that the fusion results of our method have prominent infrared targets and clear texture details. Both subjective and objective assessments are better than the other twelve advanced algorithms, proving our method’s effectiveness.
KW - Deep learning
KW - bimodal feature extraction
KW - image fusion
KW - infrared and visible images
KW - infrared target mask
UR - http://www.scopus.com/inward/record.url?scp=85195363587&partnerID=8YFLogxK
U2 - 10.1109/TMM.2024.3410113
DO - 10.1109/TMM.2024.3410113
M3 - Article
AN - SCOPUS:85195363587
SN - 1520-9210
VL - 26
SP - 10610
EP - 10622
JO - IEEE Transactions on Multimedia
JF - IEEE Transactions on Multimedia
ER -