Abstract
Highlights:
What are the main findings?
- The study identifies spectral homogenization as a primary bottleneck in aerial detection and demonstrates that the proposed Hierarchical Granularity Block (HG-Block) and Cross-Stage Context Modulation (CSCM) preserve fine-grained details while filtering background clutter.
- Extensive experiments on the DUT Anti-UAV and Anti-UAV datasets show that Eagle-YOLO achieves a superior speed–accuracy tradeoff, with the lightweight variant surpassing the RTMDet-T baseline by 1.67% AP while maintaining a real-time inference speed of 141 FPS.
What are the implications of the main findings?
- The results indicate that dynamically aligning receptive fields via Scale-Adaptive Heterogeneous Convolution (SAHC) is critical for distinguishing minute mechanical drones from biological distractors such as birds, challenging the dominance of homogeneous convolutions in real-time detectors.
- The proposed framework offers a practical solution for low-altitude airspace security and is suited to deployment on battery-powered edge monitoring platforms that require high precision under strict computational constraints.

Real-time object detection in Unmanned Aerial Vehicle (UAV) imagery presents unique challenges, chiefly extreme scale variation and intense background clutter. Existing detectors often suffer from spectral homogenization, in which the critical high-frequency details of minute targets are washed out by dominant background signals during feature downsampling. To address this, we propose Eagle-YOLO, a dynamic feature aggregation framework designed to handle these complexities without compromising inference speed.
We introduce three core innovations: (1) the Hierarchical Granularity Block (HG-Block), which employs a residual granularity injection pathway to function as a detail anchor for tiny objects while simultaneously accumulating semantics for large structures; (2) the Cross-Stage Context Modulation (CSCM) mechanism, which leverages a global context query to filter background redundancy and recalibrate features across network stages; and (3) the Scale-Adaptive Heterogeneous Convolution (SAHC) strategy, which dynamically aligns receptive fields with the inherent scale distribution of aerial data. Extensive experiments on the DUT Anti-UAV dataset demonstrate that Eagle-YOLO achieves a favorable balance between accuracy and latency. Specifically, our lightweight Eagle-YOLO-T variant achieves 74.62% AP, surpassing the RTMDet-T baseline by 1.67% while maintaining a real-time inference speed of 141 FPS on an NVIDIA RTX 4090 GPU. Furthermore, on the challenging Anti-UAV dataset, our Eagle-YOLOv8-M variant reaches 94.38% (Formula presented.), outperforming the standard YOLOv8-M by 2.83% and demonstrating its efficacy for edge-deployed aerial surveillance applications.
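The reported figures can be cross-checked with a little arithmetic. The sketch below assumes the quoted gains (1.67% and 2.83%) are absolute percentage-point differences, which the abstract does not state explicitly, and that throughput is measured one frame at a time; the function names are illustrative and do not come from the paper.

```python
def implied_baseline(score: float, gain: float) -> float:
    """Baseline score implied by a reported score and an absolute gain.

    Assumes the gain is in percentage points, not a relative improvement.
    """
    return round(score - gain, 2)


def frame_latency_ms(fps: float) -> float:
    """Average per-frame latency in milliseconds for a given throughput."""
    return 1000.0 / fps


# Figures quoted in the abstract.
rtmdet_t_ap = implied_baseline(74.62, 1.67)     # DUT Anti-UAV, Eagle-YOLO-T vs. RTMDet-T
yolov8_m_score = implied_baseline(94.38, 2.83)  # Anti-UAV, Eagle-YOLOv8-M vs. YOLOv8-M
latency = frame_latency_ms(141)                 # 141 FPS on an NVIDIA RTX 4090

print(f"Implied RTMDet-T AP: {rtmdet_t_ap}%")        # 72.95%
print(f"Implied YOLOv8-M score: {yolov8_m_score}%")  # 91.55%
print(f"Per-frame latency at 141 FPS: {latency:.2f} ms")  # 7.09 ms
```

Under these assumptions, the implied baselines are 72.95% AP for RTMDet-T and 91.55% for YOLOv8-M, and 141 FPS corresponds to roughly 7.09 ms per frame.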
| Original language | English |
|---|---|
| Article number | 112 |
| Journal | Drones |
| Volume | 10 |
| Issue number | 2 |
| Publication status | Published - Feb 2026 |
| Externally published | Yes |
Keywords
- UAV object detection
- feature aggregation
- multi-granularity
- real-time detection
Fingerprint
Dive into the research topics of 'Eagle-YOLO: Enhancing Real-Time Small Object Detection in UAVs via Multi-Granularity Feature Aggregation'. Together they form a unique fingerprint.