UAV detection in complex background with multi-scale feature fusion enhancement and channel-weight matching up-sampling

Huijuan Zhang, Kunpeng Li, Miaoxin Ji*, Zhenjiang Liu, Chi Zhang, Yuanjin Yu

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

The reckless flight of unmanned aerial vehicles (UAVs) seriously threatens public and aviation safety. Because UAVs are small and lack distinctive features, detecting them remains a great challenge for current detection algorithms, especially in complex backgrounds with backlighting. To address these issues, this paper proposes a multi-scale feature fusion enhancement strategy and a channel-weight matching (CWM) rule. The multi-scale feature fusion enhancement strategy captures multi-scale contextual information, which both suppresses information conflicts and enhances feature extraction capability. Then, an up-sampling method based on CWM is designed to enhance sensitivity to small objects; it applies different up-sampling techniques according to the importance level of each feature channel. Finally, a feature refinement module for small objects is designed to further enhance the characterization of their features. Ablation and comparative experiments are carried out on a self-made UAV dataset. Compared with the original YOLOv5 algorithm, the proposed method increases mAP0.5 by 3.6% and mAP0.5:0.95 by 2.8%. Comparative experiments are also conducted on the VisDrone2019 dataset, where the mAP0.5 and mAP0.5:0.95 of the proposed method increase by 4.2% and 1.6%, respectively.
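The idea of choosing an up-sampling technique per channel according to its importance can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say how channel weights are computed or which techniques are paired with which channels. The sketch assumes, purely for illustration, that importance is the normalised mean absolute activation, that the top-weighted channels get 2x bilinear interpolation, and that the rest get 2x nearest-neighbour copying. The names `channel_weights`, `cwm_upsample`, and `keep_ratio` are hypothetical.

```python
from typing import List

Channel = List[List[float]]  # one H x W feature map


def channel_weights(feats: List[Channel]) -> List[float]:
    """Assumed importance score: mean |activation| per channel, normalised to sum to 1."""
    scores = [
        sum(abs(v) for row in ch for v in row) / (len(ch) * len(ch[0]))
        for ch in feats
    ]
    total = sum(scores) or 1.0  # avoid division by zero on all-zero input
    return [s / total for s in scores]


def upsample_nearest(ch: Channel) -> Channel:
    """2x nearest-neighbour up-sampling: each value is copied into a 2x2 block."""
    out = []
    for row in ch:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def upsample_bilinear(ch: Channel) -> Channel:
    """2x bilinear up-sampling with corner-aligned sampling positions."""
    h, w = len(ch), len(ch[0])
    out_h, out_w = 2 * h, 2 * w
    out = []
    for i in range(out_h):
        y = i * (h - 1) / (out_h - 1)
        y0 = int(y)
        y1 = min(y0 + 1, h - 1)
        fy = y - y0
        row = []
        for j in range(out_w):
            x = j * (w - 1) / (out_w - 1)
            x0 = int(x)
            x1 = min(x0 + 1, w - 1)
            fx = x - x0
            top = ch[y0][x0] * (1 - fx) + ch[y0][x1] * fx
            bottom = ch[y1][x0] * (1 - fx) + ch[y1][x1] * fx
            row.append(top * (1 - fy) + bottom * fy)
        out.append(row)
    return out


def cwm_upsample(feats: List[Channel], keep_ratio: float = 0.5) -> List[Channel]:
    """Assumed matching rule: top-weighted channels use bilinear, the rest nearest."""
    weights = channel_weights(feats)
    k = max(1, round(len(feats) * keep_ratio))
    ranked = sorted(range(len(feats)), key=lambda c: weights[c], reverse=True)
    important = set(ranked[:k])
    return [
        upsample_bilinear(ch) if c in important else upsample_nearest(ch)
        for c, ch in enumerate(feats)
    ]


# Two 2x2 channels: the first has much stronger activations than the second.
feats = [[[0.0, 1.0], [2.0, 3.0]], [[0.0, 0.1], [0.2, 0.3]]]
up = cwm_upsample(feats, keep_ratio=0.5)
```

In a real detector the weights would more plausibly come from a learned channel-attention branch, and the operation would run on tensors inside the network neck; the list-based version above only shows the routing logic.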

Original language: English
Article number: 016009
Journal: Physica Scripta
Volume: 100
Issue number: 1
Publication status: Published - 1 Jan 2025

Keywords

  • channel-weight matching up-sampling
  • feature fusion
  • feature refinement
  • small object detection
  • unmanned aerial vehicle
