TY - JOUR
T1 - IBFusion
T2 - An Infrared and Visible Image Fusion Method Based on Infrared Target Mask and Bimodal Feature Extraction Strategy
AU - Bai, Yang
AU - Gao, Meijing
AU - Li, Shiyu
AU - Wang, Ping
AU - Guan, Ning
AU - Yin, Haozheng
AU - Yan, Yonghao
N1 - Publisher Copyright: IEEE
PY - 2024
Y1 - 2024
N2 - The fusion of infrared (IR) and visible (VIS) images aims to capture complementary information from diverse sensors, resulting in a fused image that enhances the overall human perception of the scene. However, existing fusion methods face challenges in preserving diverse feature information, leading to cross-modal interference, feature degradation, and detail loss in the fused image. To solve the above problems, this paper proposes an image fusion method based on the infrared target mask and a bimodal feature extraction strategy, termed IBFusion. Firstly, we define an infrared target mask, employing it to retain crucial information from the source images in the fused result. Additionally, we devise a mixed loss function, encompassing content loss, gradient loss, and structure loss, to ensure the coherence of the fused image with the IR and VIS images. Then, the mask is introduced into the mixed loss function to guide feature extraction and unsupervised network optimization. Secondly, we create a bimodal feature extraction strategy and construct a Dual-channel Multi-scale Feature Extraction Module (DMFEM) to extract thermal target information from the IR image and background texture information from the VIS image. This module retains the complementary information of the two source images. Finally, we use the Feature Fusion Module (FFM) to fuse the features effectively, generating the fusion result. Experiments on three public datasets demonstrate that the fusion results of our method have prominent infrared targets and clear texture details.
KW - bimodal feature extraction
KW - Data mining
KW - Deep learning
KW - Degradation
KW - Feature extraction
KW - Generative adversarial networks
KW - image fusion
KW - infrared and visible images
KW - infrared target mask
KW - Training
UR - http://www.scopus.com/inward/record.url?scp=85195363587&partnerID=8YFLogxK
U2 - 10.1109/TMM.2024.3410113
DO - 10.1109/TMM.2024.3410113
M3 - Article
AN - SCOPUS:85195363587
SN - 1520-9210
SP - 1
EP - 13
JO - IEEE Transactions on Multimedia
JF - IEEE Transactions on Multimedia
ER -