TY - JOUR
T1 - Reinforcement Learning-Based Method for Collaborative Target Assignment Against Heterogeneous UAV Swarms
AU - Wei, Haoming
AU - Wang, Zhengjie
AU - Wang, Yue
AU - Hong, Xiaotong
N1 - Publisher Copyright: © 2026, Beijing Institute of Technology. All Rights Reserved.
PY - 2026
Y1 - 2026
AB - To counter saturation attacks by drone swarms on air defense systems and to realize the goal of “using swarms to counter swarms”, a cooperative target assignment method based on proximal policy optimization (PPO) is proposed. The approach incorporates an attention mechanism to capture interaction features between the intercepting and target clusters, enhancing the model’s situational awareness. A hierarchical masking mechanism is also introduced to handle variable-scale target clusters, dynamically screen available interceptors, and avoid fire overlap, thereby satisfying cooperative constraints. Experiments demonstrate that the method maintains good generalization and robustness in complex adversarial scenarios, offering a new solution for intelligent target assignment under dynamic threats.
KW - air defense interception
KW - dynamic target assignment
KW - proximal policy optimization (PPO)
KW - self-attention mechanism
UR - https://www.scopus.com/pages/publications/105038736676
U2 - 10.15918/j.tbit1001-0645.2026.003
DO - 10.15918/j.tbit1001-0645.2026.003
M3 - Article
AN - SCOPUS:105038736676
SN - 1001-0645
VL - 46
SP - 527
EP - 533
JO - Beijing Ligong Daxue Xuebao/Transaction of Beijing Institute of Technology
JF - Beijing Ligong Daxue Xuebao/Transaction of Beijing Institute of Technology
IS - 5
ER -