Multispectral Feature-Fusion Object Detection Based on Multi-Scale Auxiliary Branch

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Object detection in real-world environments is challenging because factors such as poor illumination can significantly reduce detection accuracy. To address this, we propose a multispectral feature-fusion object detection framework built on a dual-modal backbone derived from YOLOv8. Within the backbone, we introduce a Feature Reconstruction Fusion module based on the Swin Transformer, which reconstructs dual-modal features according to the intensity of their spatial information and learns complementary inter-modal information. In addition, we add multi-scale auxiliary branches during the training stage to supply single-modal gradient information at different levels, strengthening the model's ability to recognize multi-scale targets. We evaluate our method on several public datasets, including FLIR-aligned, LLVIP, and M3FD. The results show that our network achieves high accuracy on small targets and outperforms other state-of-the-art networks in mean average precision.
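The training-only auxiliary branches described above can be illustrated with a minimal sketch. The abstract does not state how the auxiliary losses are combined with the main detection loss, so the weighted sum below, the function name `training_loss`, and the `aux_weight` value are all assumptions made for illustration, not the authors' formulation.

```python
from typing import Sequence


def training_loss(
    fused_loss: float,
    aux_rgb_losses: Sequence[float],
    aux_ir_losses: Sequence[float],
    aux_weight: float = 0.25,  # hypothetical weight, not taken from the paper
    training: bool = True,
) -> float:
    """Combine the fused-branch loss with per-scale single-modal auxiliary losses.

    Assumption: a plain weighted sum. The auxiliary terms are used only
    during training; at inference only the fused branch contributes.
    """
    if not training:
        return fused_loss
    if len(aux_rgb_losses) != len(aux_ir_losses):
        raise ValueError("expected one auxiliary loss per scale for each modality")
    # One auxiliary loss per scale and per modality (visible and infrared).
    aux_total = sum(aux_rgb_losses) + sum(aux_ir_losses)
    return fused_loss + aux_weight * aux_total
```

With three scales per modality, `training_loss(1.0, [0.5, 0.5, 1.0], [1.0, 0.5, 0.5])` adds the weighted auxiliary terms to the fused loss, while passing `training=False` returns the fused loss unchanged, mirroring the idea that the auxiliary branches exist only at training time.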

Original language: English
Title of host publication: Proceedings of the 44th Chinese Control Conference, CCC 2025
Editors: Jian Sun, Hongpeng Yin
Publisher: IEEE Computer Society
Pages: 7727-7732
Number of pages: 6
ISBN (Electronic): 9789887581611
Publication status: Published - 2025
Event: 44th Chinese Control Conference, CCC 2025 - Chongqing, China
Duration: 28 Jul 2025 - 30 Jul 2025

Publication series

Name: Chinese Control Conference, CCC
ISSN (Print): 1934-1768
ISSN (Electronic): 2161-2927

Conference

Conference: 44th Chinese Control Conference, CCC 2025
Country/Territory: China
City: Chongqing
Period: 28/07/25 - 30/07/25

Keywords

  • auxiliary branch
  • multispectral object detection
  • Swin Transformer
  • YOLOv8
