TY - JOUR
T1 - W-shaped network
T2 - a lightweight network for real-time infrared and visible image fusion
AU - Zhang, Tingting
AU - Du, Huiqian
AU - Xie, Min
N1 - Publisher Copyright:
© 2023 SPIE and IS&T.
PY - 2023/11/1
Y1 - 2023/11/1
N2 - Autoencoders (AEs) are widely used in image fusion. However, AE-based fusion methods usually use the same encoder to extract features from images captured by different sensors/modalities, without accounting for the differences between them. In addition, these methods cannot fuse images in real time. To solve these problems, an end-to-end fusion network is proposed for fast infrared and visible image fusion. We design an end-to-end W-shaped network (W-Net), which consists of two independent encoders, one shared decoder, and skip connections. The two encoders extract the representative features of images from different sources, and the decoder combines the hierarchical features from corresponding layers and reconstructs the fused image without an additional fusion layer or any handcrafted fusion rules. Skip connections help retain details and salient features in the fused image. Notably, W-Net is lightweight, with fewer parameters than existing AE-based methods. Experimental results show that our fusion network performs well in both subjective and objective visual assessments compared with other state-of-the-art fusion methods. It fuses images very fast (e.g., the fusion time for 20 pairs of images in the TNO dataset is 0.871 to 1.081 ms), operating above real-time speed.
AB - Autoencoders (AEs) are widely used in image fusion. However, AE-based fusion methods usually use the same encoder to extract features from images captured by different sensors/modalities, without accounting for the differences between them. In addition, these methods cannot fuse images in real time. To solve these problems, an end-to-end fusion network is proposed for fast infrared and visible image fusion. We design an end-to-end W-shaped network (W-Net), which consists of two independent encoders, one shared decoder, and skip connections. The two encoders extract the representative features of images from different sources, and the decoder combines the hierarchical features from corresponding layers and reconstructs the fused image without an additional fusion layer or any handcrafted fusion rules. Skip connections help retain details and salient features in the fused image. Notably, W-Net is lightweight, with fewer parameters than existing AE-based methods. Experimental results show that our fusion network performs well in both subjective and objective visual assessments compared with other state-of-the-art fusion methods. It fuses images very fast (e.g., the fusion time for 20 pairs of images in the TNO dataset is 0.871 to 1.081 ms), operating above real-time speed.
KW - autoencoder
KW - image fusion
KW - lightweight network
KW - multi-scale features
UR - http://www.scopus.com/inward/record.url?scp=85179465250&partnerID=8YFLogxK
U2 - 10.1117/1.JEI.32.6.063005
DO - 10.1117/1.JEI.32.6.063005
M3 - Article
AN - SCOPUS:85179465250
SN - 1017-9909
VL - 32
JO - Journal of Electronic Imaging
JF - Journal of Electronic Imaging
IS - 6
M1 - 063005
ER -