TY - JOUR
T1 - W-shaped network
T2 - a lightweight network for real-time infrared and visible image fusion
AU - Zhang, Tingting
AU - Du, Huiqian
AU - Xie, Min
N1 - Publisher Copyright:
© 2023 SPIE and IS&T.
PY - 2023/11/1
Y1 - 2023/11/1
N2 - Autoencoders (AEs) are widely used in image fusion. However, AE-based fusion methods usually use the same encoder to extract features from images captured by different sensors/modalities, without accounting for the differences between them. In addition, these methods cannot fuse images in real time. To solve these problems, an end-to-end fusion network is proposed for fast infrared and visible image fusion. We design an end-to-end W-shaped network (W-Net), which consists of two independent encoders, one shared decoder, and skip connections. The two encoders extract the representative features of images from different sources, and the decoder combines the hierarchical features from corresponding layers and reconstructs the fused image without an additional fusion layer or any handcrafted fusion rules. Skip connections help retain details and salient features in the fused image. Notably, W-Net is lightweight, with fewer parameters than existing AE-based methods. Experimental results show that our fusion network performs well in both subjective and objective visual assessments compared with other state-of-the-art fusion methods. It fuses images very fast (e.g., the fusion time for 20 pairs of images in the TNO dataset is 0.871 to 1.081 ms), operating above real-time speed.
AB - Autoencoders (AEs) are widely used in image fusion. However, AE-based fusion methods usually use the same encoder to extract features from images captured by different sensors/modalities, without accounting for the differences between them. In addition, these methods cannot fuse images in real time. To solve these problems, an end-to-end fusion network is proposed for fast infrared and visible image fusion. We design an end-to-end W-shaped network (W-Net), which consists of two independent encoders, one shared decoder, and skip connections. The two encoders extract the representative features of images from different sources, and the decoder combines the hierarchical features from corresponding layers and reconstructs the fused image without an additional fusion layer or any handcrafted fusion rules. Skip connections help retain details and salient features in the fused image. Notably, W-Net is lightweight, with fewer parameters than existing AE-based methods. Experimental results show that our fusion network performs well in both subjective and objective visual assessments compared with other state-of-the-art fusion methods. It fuses images very fast (e.g., the fusion time for 20 pairs of images in the TNO dataset is 0.871 to 1.081 ms), operating above real-time speed.
KW - autoencoder
KW - image fusion
KW - lightweight network
KW - multi-scale features
UR - http://www.scopus.com/inward/record.url?scp=85179465250&partnerID=8YFLogxK
U2 - 10.1117/1.JEI.32.6.063005
DO - 10.1117/1.JEI.32.6.063005
M3 - Article
AN - SCOPUS:85179465250
SN - 1017-9909
VL - 32
JO - Journal of Electronic Imaging
JF - Journal of Electronic Imaging
IS - 6
M1 - 063005
ER -