Abstract
In this Letter, we propose CLIP-guided multimodal registration and fusion (CGMRF), a multimodal image fusion system based on semantic understanding, for visible and infrared (IR) dual-modality imaging. CGMRF leverages semantic similarity, which aligns better with human visual interpretation, to address the challenges of multimodal image registration and fusion. Experimental results across multiple metrics demonstrate the advantages of the proposed CGMRF system.
Original language | English |
---|---|
Pages (from-to) | 3907-3910 |
Number of pages | 4 |
Journal | Optics Letters |
Volume | 50 |
Issue number | 12 |
DOIs | |
Publication status | Published - 15 Jun 2025 |