Target detection based on multi-scale feature fusion and cross-channel interactive attention mechanism

Chenyang Zhao, Yong Song*, Xin Yang, Ya Zhou*, Jinqi Yang

*Corresponding author for this work

Research output: Contribution to journal › Conference article › peer-review

3 Citations (Scopus)

Abstract

To address the complex backgrounds, target scale variation, and small targets found in aerial image detection, we propose a YOLOv5-based target detection algorithm that combines multi-scale feature fusion with a cross-channel interactive attention mechanism. The algorithm has three components. First, M-PPM (multi-scale pyramid pooling module) replaces the SPP (Spatial Pyramid Pooling) structure in YOLOv5, so that features at different scales are fully used when fusing global feature information. Second, CCA (cross-channel interactive attention mechanism) enables information to be exchanged and used across channels, improving the network's generalization capability and the fusion efficiency of small-target features. Third, a Bi-directional Feature Pyramid Network (BiFPN) is used to handle scale differences in multi-target detection. On the VisDrone and UAVDT aerial data sets, the proposed algorithm achieves results 2.3 % and 1.8 % higher than YOLOv5, respectively.
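The abstract does not say how the BiFPN stage is configured, so the following is only a minimal sketch of the "fast normalized fusion" rule from the original BiFPN design, not the authors' implementation. It assumes the input feature maps have already been resized to a common shape and flattened into plain lists; the function name `fast_normalized_fusion` and its parameters are hypothetical.

```python
from typing import List, Sequence


def fast_normalized_fusion(
    features: Sequence[Sequence[float]],
    weights: Sequence[float],
    eps: float = 1e-4,
) -> List[float]:
    """Fuse same-length feature vectors with non-negative, normalized weights.

    Illustrative only: a real BiFPN learns `weights` during training and
    applies this rule to multi-channel feature maps, followed by a convolution.
    """
    if not features:
        raise ValueError("at least one feature vector is required")
    if len(features) != len(weights):
        raise ValueError("need exactly one weight per feature vector")
    length = len(features[0])
    if any(len(f) != length for f in features):
        raise ValueError("all feature vectors must have the same length")

    # ReLU keeps every weight non-negative, so the normalization is stable.
    clipped = [max(0.0, w) for w in weights]
    # eps guards against division by zero when all weights are clipped to 0.
    denom = eps + sum(clipped)

    fused = [0.0] * length
    for w, feat in zip(clipped, features):
        scale = w / denom
        for i, value in enumerate(feat):
            fused[i] += scale * value
    return fused


if __name__ == "__main__":
    # Two hypothetical feature vectors from different pyramid levels.
    shallow = [1.0, 2.0]
    deep = [3.0, 4.0]
    print(fast_normalized_fusion([shallow, deep], [1.0, 1.0]))
```

With equal weights the result is close to the element-wise mean of the inputs; a larger weight on one input pulls the fused output toward that scale, which is how such a fusion can emphasize the pyramid level that best matches a target's size.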

Original language: English
Article number: 012046
Journal: Journal of Physics: Conference Series
Volume: 2562
Issue number: 1
Publication status: Published - 2023
Event: 2023 3rd International Conference on Artificial Intelligence and Industrial Technology Applications, AIITA 2023 - Suzhou, China
Duration: 24 Mar 2023 to 26 Mar 2023
