PromptFusion: Harmonized Semantic Prompt Learning for Infrared and Visible Image Fusion

  • Jinyuan Liu
  • Xingyuan Li
  • Zirui Wang
  • Zhiying Jiang
  • Wei Zhong
  • Wei Fan
  • Bin Xu*

* Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

27 Citations (Scopus)

Abstract

The goal of infrared and visible image fusion (IVIF) is to integrate the complementary advantages of both modalities to achieve a more comprehensive understanding of a scene. However, existing methods struggle to handle modal disparities effectively, which degrades the details and prominent targets in the fused images. To address these challenges, we introduce PromptFusion, a prompt-based approach that harmoniously combines multi-modality images under the guidance of semantic prompts. First, to better characterize the features of different modalities, a contourlet autoencoder is designed to separate and extract the high- and low-frequency components of each modality, thereby improving the extraction of fine details and textures. Second, we introduce a prompt learning mechanism using positive and negative prompts, leveraging Vision-Language Models to improve the fusion model's understanding and identification of targets in multi-modality images, leading to improved performance in downstream tasks. Furthermore, we employ bi-level asymptotic convergence optimization. This approach simplifies the intricate non-singleton non-convex bi-level problem into a series of convergent and differentiable single optimization problems that can be effectively resolved through gradient descent. Our approach advances the state of the art, delivering superior fusion quality and boosting the performance of related downstream tasks.
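
The idea of separating high- and low-frequency components before fusing can be illustrated with a minimal sketch. This is not the contourlet autoencoder described in the abstract: as a loudly stated assumption, it swaps in a plain box-blur low-pass filter with a residual high-pass band, averages the low bands, and keeps the larger-magnitude high-band coefficient at each pixel. The function names (box_blur, split_bands, fuse) and the radius parameter are hypothetical and chosen only for illustration.

```python
def box_blur(img, radius=1):
    """Low-pass filter: mean over a (2r+1)x(2r+1) window, truncated at the borders."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total, count = 0.0, 0
            for yy in range(max(0, y - radius), min(h, y + radius + 1)):
                for xx in range(max(0, x - radius), min(w, x + radius + 1)):
                    total += img[yy][xx]
                    count += 1
            out[y][x] = total / count
    return out


def split_bands(img, radius=1):
    """Split an image into a low-frequency band and a high-frequency residual."""
    low = box_blur(img, radius)
    high = [[p - l for p, l in zip(row, low_row)]
            for row, low_row in zip(img, low)]
    return low, high


def fuse(ir, vis, radius=1):
    """Toy fusion rule (an assumption, not the paper's learned fusion):
    average the low bands, keep the stronger high-band response per pixel."""
    ir_low, ir_high = split_bands(ir, radius)
    vis_low, vis_high = split_bands(vis, radius)
    fused = []
    for y in range(len(ir)):
        row = []
        for x in range(len(ir[0])):
            low = 0.5 * (ir_low[y][x] + vis_low[y][x])
            a, b = ir_high[y][x], vis_high[y][x]
            high = a if abs(a) >= abs(b) else b
            row.append(low + high)
        fused.append(row)
    return fused
```

Images here are nested lists of floats of equal size. Because the high band is defined as the residual of the low band, the two bands sum back to the input, so the split itself loses no information. The learned, multi-directional decomposition in the paper is considerably richer than this fixed filter.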

Original language: English
Pages (from-to): 502-515
Number of pages: 14
Journal: IEEE/CAA Journal of Automatica Sinica
Volume: 12
Issue number: 3
Publication status: Published - 2025

Keywords

  • Bi-level optimization
  • image fusion
  • infrared and visible image
  • prompt learning
