TY - JOUR
T1 - VIFNet
T2 - An end-to-end visible–infrared fusion network for image dehazing
AU - Yu, Meng
AU - Cui, Te
AU - Lu, Haoyang
AU - Yue, Yufeng
N1 - Publisher Copyright:
© 2024 Elsevier B.V.
PY - 2024/9/28
Y1 - 2024/9/28
N2 - Image dehazing poses significant challenges in environmental perception. Recent research has mainly focused on deep learning-based methods using a single modality, which may result in severe information loss, especially in dense-haze scenarios. Infrared images exhibit robustness to haze; however, existing methods have primarily treated the infrared modality as auxiliary information, failing to fully exploit its rich information for dehazing. To address this challenge, the key insight of this study is to design a visible–infrared fusion network for image dehazing. In particular, we propose a multi-scale Deep Structure Feature Extraction (DSFE) module, which incorporates the Channel-Pixel Attention Block (CPAB) to restore more spatial and marginal information within the deep structural features. Additionally, we introduce an inconsistency weighted fusion strategy that merges the two modalities by leveraging the more reliable information. To validate this, we construct a visible–infrared multimodal dataset called AirSim-VID based on the AirSim simulation platform. Extensive experiments performed on challenging real and simulated image datasets demonstrate that VIFNet outperforms many state-of-the-art competing methods. The code and dataset are available at https://github.com/mengyu212/VIFNet_dehazing.
AB - Image dehazing poses significant challenges in environmental perception. Recent research has mainly focused on deep learning-based methods using a single modality, which may result in severe information loss, especially in dense-haze scenarios. Infrared images exhibit robustness to haze; however, existing methods have primarily treated the infrared modality as auxiliary information, failing to fully exploit its rich information for dehazing. To address this challenge, the key insight of this study is to design a visible–infrared fusion network for image dehazing. In particular, we propose a multi-scale Deep Structure Feature Extraction (DSFE) module, which incorporates the Channel-Pixel Attention Block (CPAB) to restore more spatial and marginal information within the deep structural features. Additionally, we introduce an inconsistency weighted fusion strategy that merges the two modalities by leveraging the more reliable information. To validate this, we construct a visible–infrared multimodal dataset called AirSim-VID based on the AirSim simulation platform. Extensive experiments performed on challenging real and simulated image datasets demonstrate that VIFNet outperforms many state-of-the-art competing methods. The code and dataset are available at https://github.com/mengyu212/VIFNet_dehazing.
KW - Inconsistency weight
KW - Multimodal image dehazing
KW - Visible–infrared fusion
UR - http://www.scopus.com/inward/record.url?scp=85196776471&partnerID=8YFLogxK
U2 - 10.1016/j.neucom.2024.128105
DO - 10.1016/j.neucom.2024.128105
M3 - Article
AN - SCOPUS:85196776471
SN - 0925-2312
VL - 599
JO - Neurocomputing
JF - Neurocomputing
M1 - 128105
ER -