MGT: Modality-Guided Transformer for Infrared and Visible Image Fusion

Taoying Zhang, Hesong Li, Qiankun Liu, Xiaoyong Wang*, Ying Fu

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

1 Citation (Scopus)

Abstract

Infrared and visible image fusion aims to generate high-quality fused images containing thermal radiation information from infrared images and texture information from visible images. Most deep learning-based methods are simple stacks of Transformer or convolution blocks and fail to further integrate the feature information of source images that may be missed in the fusion stage after generating the fused features. In this work, we develop a cross-attention-based macro framework, named Modality-Guided Transformer (MGT), that reintroduces detailed information from the two input images across multiple feature extraction layers into the initially obtained fused image. For efficiency, our MGT also introduces shared attention and multi-scale windows to reduce the computational costs of attention. Experimental results show that the proposed MGT outperforms state-of-the-art methods, especially in preserving salient targets and infrared texture details. Our code is publicly available at https://github.com/TaoYing-Zhang/MGT.

Original language: English
Title of host publication: Pattern Recognition and Computer Vision - 6th Chinese Conference, PRCV 2023, Proceedings
Editors: Qingshan Liu, Hanzi Wang, Rongrong Ji, Zhanyu Ma, Weishi Zheng, Hongbin Zha, Xilin Chen, Liang Wang
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 321-332
Number of pages: 12
ISBN (Print): 9789819984282
Publication status: Published - 2024
Event: 6th Chinese Conference on Pattern Recognition and Computer Vision, PRCV 2023 - Xiamen, China
Duration: 13 Oct 2023 - 15 Oct 2023

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 14425 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 6th Chinese Conference on Pattern Recognition and Computer Vision, PRCV 2023
Country/Territory: China
City: Xiamen
Period: 13/10/23 - 15/10/23

Keywords

  • Cross-attention
  • Infrared and visible image fusion
  • Modality-guided
  • Transformer
