TY - JOUR
T1 - Prior-Guided RGB-T Object Detection for UAVs
AU - Yu, Lingyi
AU - Wang, Zhengjie
AU - Zhan, Zhaohuan
N1 - Publisher Copyright: © 2026, Beijing Institute of Technology. All Rights Reserved.
PY - 2026
Y1 - 2026
N2 - RGB-T object detection improves accuracy and robustness in complex environments by fusing complementary information from visible and thermal infrared images. Practical UAV applications pose two core challenges: ① intra-modality: visible images captured at night or in bad weather suffer severe degradation and loss of detail; ② inter-modality: varying viewpoints, small targets, and complex backgrounds make target information hard to align during cross-modal fusion, which introduces noise and makes small targets difficult to detect. To address these challenges, a prior-guided RGB-T detection scheme for UAVs is proposed. For the intra-modality problem, a pre-trained low-light enhancement prior is used to enhance low-light RGB images in the spatial domain and restore detail. For the inter-modality problem, a human attention prior is introduced to design a lightweight foreground discrimination branch, which helps the model focus on target regions through multi-task learning and reduces background noise. Experimental results show that the framework achieves robust detection in complex scenarios with varying illumination and multi-scale targets, providing reliable multi-modal detection support for low-altitude intelligent perception.
AB - RGB-T object detection improves accuracy and robustness in complex environments by fusing complementary information from visible and thermal infrared images. Practical UAV applications pose two core challenges: ① intra-modality: visible images captured at night or in bad weather suffer severe degradation and loss of detail; ② inter-modality: varying viewpoints, small targets, and complex backgrounds make target information hard to align during cross-modal fusion, which introduces noise and makes small targets difficult to detect. To address these challenges, a prior-guided RGB-T detection scheme for UAVs is proposed. For the intra-modality problem, a pre-trained low-light enhancement prior is used to enhance low-light RGB images in the spatial domain and restore detail. For the inter-modality problem, a human attention prior is introduced to design a lightweight foreground discrimination branch, which helps the model focus on target regions through multi-task learning and reduces background noise. Experimental results show that the framework achieves robust detection in complex scenarios with varying illumination and multi-scale targets, providing reliable multi-modal detection support for low-altitude intelligent perception.
KW - RGB-T object detection
KW - UAV object detection
KW - multimodal fusion
KW - small object detection
UR - https://www.scopus.com/pages/publications/105038693055
U2 - 10.15918/j.tbit1001-0645.2025.169
DO - 10.15918/j.tbit1001-0645.2025.169
M3 - Article
AN - SCOPUS:105038693055
SN - 1001-0645
VL - 46
SP - 470
EP - 479
JO - Beijing Ligong Daxue Xuebao/Transaction of Beijing Institute of Technology
JF - Beijing Ligong Daxue Xuebao/Transaction of Beijing Institute of Technology
IS - 5
ER -