Infrared and Visible Image Fusion with Overlapped Window Transformer

Research output: Contribution to journal › Article › peer-review

Abstract

An overlapped-window-based transformer is proposed for infrared and visible image fusion. A multi-head self-attention mechanism based on overlapping windows is designed: by introducing overlapping regions between windows, local features can interact across different windows, avoiding the discontinuity and information-isolation issues caused by non-overlapping partitions. The proposed model is trained with an unsupervised loss function composed of three terms: a pixel loss, a gradient loss, and a structural loss. With the end-to-end model and the unsupervised loss function, our method eliminates the need to manually design complex activity-level measurements and fusion strategies. Extensive experiments on the public TNO (grayscale) and RoadScene (RGB) datasets demonstrate that the proposed method achieves the expected long-distance dependency modeling when fusing infrared and visible images, and yields positive results in both qualitative and quantitative evaluations.
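The key idea of the overlapping partition can be illustrated with a minimal sketch: when windows are extracted with a stride smaller than the window size, adjacent windows share border rows and columns, so per-window local self-attention can propagate information across window boundaries. The function below is an illustrative assumption of how such a partition might look (names, window size, and stride are not taken from the paper):

```python
import numpy as np

def overlapped_windows(feat, win=4, stride=3):
    """Partition a 2-D feature map into overlapping win x win windows.

    With stride < win, neighboring windows share (win - stride) rows or
    columns, so attention computed inside each window still sees pixels
    that also belong to the adjacent window. Illustrative sketch only;
    not the paper's actual implementation.
    """
    h, w = feat.shape
    windows = []
    for top in range(0, h - win + 1, stride):
        for left in range(0, w - win + 1, stride):
            windows.append(feat[top:top + win, left:left + win])
    return np.stack(windows)

# An 8x8 map with win=4, stride=3 yields a 2x2 grid of windows,
# each pair of horizontal neighbors sharing one column of pixels.
feat = np.arange(64, dtype=float).reshape(8, 8)
wins = overlapped_windows(feat)
```

With a non-overlapping partition (stride equal to the window size), the shared rows and columns disappear and features at window borders can only interact through later layers, which is the discontinuity the overlapped design avoids.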

Original language: English
Pages (from-to): 838-846
Number of pages: 9
Journal: Journal of Advanced Computational Intelligence and Intelligent Informatics
Volume: 29
Issue number: 4
DOIs
Publication status: Published - Jul 2025
Externally published: Yes

Keywords

  • image fusion
  • infrared image
  • local self-attention
  • transformer
