AdaptRGB-T: Adaptive RGB-T semantic segmentation via efficient parameter-tuning with textual guidance

  • Meng Yu
  • Yufeng Yue*
  • Yi Yang
  • Mengyin Fu

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Reliable semantic segmentation is essential for intelligent systems, yet significant problems remain: 1) Existing RGB-Thermal (RGB-T) segmentation models rely mainly on visual features and lack textual information, which may lead to inaccurate segmentation when categories share similar visual characteristics. 2) While the Segment Anything Model (SAM) excels at instance-level segmentation, integrating it with thermal images and text is hindered by modality heterogeneity and computational inefficiency. Motivated by these observations, we introduce AdaptRGB-T, a parameter-efficient fine-tuning framework that uses Low-Rank Adaptation (LoRA) to adapt SAM for RGB-T semantic segmentation. Specifically, we propose an Enhanced Transformer Block (ETB) that freezes SAM's original transformer blocks and incorporates trainable LoRA layers for efficient RGB-T feature fusion. Additionally, we incorporate CLIP-generated text embeddings in the mask decoder to enable semantic alignment, which further rectifies classification errors and improves semantic understanding accuracy. Experimental results across diverse datasets demonstrate that our method achieves superior performance in challenging scenarios with fewer trainable parameters. The code will be available at https://github.com/mengyu212/AdaptRGBT.
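The core mechanism described above — freezing a pretrained transformer weight and training only a low-rank residual — can be illustrated with a minimal NumPy sketch. All names here (`LoRALinear`, the dimensions, the scaling) are hypothetical stand-ins, not the AdaptRGB-T implementation; the frozen matrix `W` plays the role of a pretrained SAM projection, while only the low-rank factors `A` and `B` would be trained.

```python
import numpy as np

class LoRALinear:
    """Illustrative LoRA-augmented linear layer: y = W x + (alpha/r) * B A x.

    W is frozen (stands in for a pretrained SAM weight); only the low-rank
    factors A (down-projection) and B (up-projection) are trainable.
    Names and shapes are hypothetical, for exposition only.
    """

    def __init__(self, d_in, d_out, rank=4, alpha=8, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.standard_normal((d_out, d_in))        # frozen pretrained weight
        self.A = rng.standard_normal((rank, d_in)) * 0.01  # trainable down-projection
        self.B = np.zeros((d_out, rank))                   # trainable, zero-initialized
        self.scale = alpha / rank

    def __call__(self, x):
        # Low-rank update adds rank*(d_in+d_out) trainable parameters
        # instead of d_in*d_out for full fine-tuning.
        return x @ self.W.T + self.scale * (x @ self.A.T @ self.B.T)

layer = LoRALinear(d_in=16, d_out=16)
x = np.ones((2, 16))
base = x @ layer.W.T
# Because B is zero-initialized, the layer initially reproduces the frozen path.
assert np.allclose(layer(x), base)
```

Zero-initializing `B` is the standard LoRA choice: at the start of fine-tuning the adapted model behaves exactly like the frozen pretrained model, and the low-rank path is learned as a residual correction.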

Original language: English
Article number: 132060
Journal: Neurocomputing
Volume: 664
DOIs
Publication status: Published - 1 Feb 2026
Externally published: Yes

Keywords

  • LoRA fine-tuning
  • RGB-T semantic segmentation
  • Segment anything model
  • Textual guidance
