Abstract
Vision-based air-to-air tracking of multi-UAV swarms is crucial for effective swarm perception. Because the UAVs in a swarm are usually of the same type, they share similar appearance features; they also exhibit nonlinear motion. This homogeneity challenges existing multi-object tracking (MOT) algorithms, whose performance often degrades because instance-specific appearance and motion cues are difficult to capture. In this paper, we propose a novel multi-frame, pose-attention-based appearance feature extraction component that captures instance-level pose features of UAVs across consecutive frames. We also introduce a motion difference accumulation strategy that extracts spatial and motion cues from multiple adjacent frames. Combining these techniques, we design a multi-frame association framework that distinguishes between similar UAVs in a swarm by leveraging object features over consecutive frames. To address the lack of relevant datasets, we create the AIRMOT dataset, tailored specifically to air-to-air tracking of homogeneous UAV swarms. We evaluate our method on the AIRMOT dataset as well as on the publicly available MOT-FLY and UAVSwarm datasets. The experimental results show that our approach outperforms other state-of-the-art (SOTA) methods.
| Field | Value |
|---|---|
| Original language | English |
| Article number | 103558 |
| Journal | Chinese Journal of Aeronautics |
| Volume | 38 |
| Issue number | 12 |
| DOIs | |
| Publication status | Published - Dec 2025 |
Keywords
- Convolutional neural networks
- Feature extraction
- Swarm intelligence
- Target tracking
- Unmanned aerial vehicles
Title
Vision-based swarm tracking of multiple UAVs in air-to-air scenarios