TY - JOUR
T1 - FGHDet
T2 - Delving into Fine-Grained Features with Head Selection for UAV Object Detection
AU - Bi, Yan Chao
AU - Ning, Yang
AU - Nie, Xiu Shan
AU - Lu, Xian Kai
AU - Zhang, Rui Heng
AU - Zhang, Huan Long
N1 - Publisher Copyright: © Institute of Computing Technology, Chinese Academy of Sciences 2025.
PY - 2025/9
Y1 - 2025/9
AB - Detecting small objects in unmanned aerial vehicle (UAV) imagery is a challenging and crucial task in computer vision. Most current methods struggle with three challenges posed by small objects: fine-grained feature mining, multi-layer feature fusion, and scale mismatches between anchors and feature maps. To alleviate these issues, we present FGHDet, which mines fine-grained information from low-level features and applies a head selection mechanism. First, our approach introduces a detail-preserving semantic information enhancement module (DSIEM) that retains fine-grained information while extracting the coarse-grained semantic details relevant to it. Then, we devise a coarse-to-fine feature guidance module (CFGM) that uses coarse-grained semantic information and fine-grained information to co-guide feature enhancement, further improving the model’s classification ability. Finally, we introduce a multiscale detection strategy based on anchor-head matching, which ensures scale-level matching between anchors and feature maps and prevents overfitting caused by overly fine anchor divisions. Extensive experiments on the VisDrone, CARPK, and Drone-vs.-Bird datasets demonstrate that FGHDet improves mAP (IoU range [0.5:0.95]) by 4.9, 4.1, and 2.2, respectively. The code is available at https://github.com/b-yanchao/UAVDetection.git.
KW - anchor-head-based scale-level matching
KW - drone-view image
KW - fine-grained information extraction
KW - learning fine-grained semantics
UR - https://www.scopus.com/pages/publications/105022623184
U2 - 10.1007/s11390-025-5252-z
DO - 10.1007/s11390-025-5252-z
M3 - Article
AN - SCOPUS:105022623184
SN - 1000-9000
VL - 40
SP - 1301
EP - 1315
JO - Journal of Computer Science and Technology
JF - Journal of Computer Science and Technology
IS - 5
ER -