TY - JOUR
T1 - Reinforcement Learning-Based Pathfinding for Multiple UAVs Facing Abrupt Hazardous Areas
AU - Wu, Qizhen
AU - Chen, Lei
AU - Liu, Kexin
AU - Lü, Jinhu
N1 - Publisher Copyright: © 2004-2012 IEEE.
PY - 2026
Y1 - 2026
N2 - Planning feasible paths for multiple uncrewed aerial vehicles (UAVs) amid abrupt hazardous areas is a critical safety challenge, and existing methods often lack safety guarantees and uncertainty handling. To address this, we propose a novel multi-agent reinforcement learning (MARL) approach to the UAV pathfinding problem. Our method ensures rapid responsiveness and adherence to safety constraints by integrating a control barrier function, guaranteeing safe replanning even during sudden route changes. To overcome the potential inefficiency of purely reactive safety, we introduce a probabilistic neural network that quantifies hazard uncertainty, improving the anticipation of sudden dangers. Finally, to utilize swarm intelligence for mutual risk avoidance, the approach incorporates neighbors’ observations through a proximity-weighted mean-field mechanism, so that each UAV accounts for this aggregated information in its planning. Extensive simulations show that our method achieves a planning success rate exceeding 90% in transient environments, outperforming traditional planners and other MARL baselines. Real-world experiments further validate the approach’s adaptability, demonstrating its practical value for safety-critical missions.
AB - Planning feasible paths for multiple uncrewed aerial vehicles (UAVs) amid abrupt hazardous areas is a critical safety challenge, and existing methods often lack safety guarantees and uncertainty handling. To address this, we propose a novel multi-agent reinforcement learning (MARL) approach to the UAV pathfinding problem. Our method ensures rapid responsiveness and adherence to safety constraints by integrating a control barrier function, guaranteeing safe replanning even during sudden route changes. To overcome the potential inefficiency of purely reactive safety, we introduce a probabilistic neural network that quantifies hazard uncertainty, improving the anticipation of sudden dangers. Finally, to utilize swarm intelligence for mutual risk avoidance, the approach incorporates neighbors’ observations through a proximity-weighted mean-field mechanism, so that each UAV accounts for this aggregated information in its planning. Extensive simulations show that our method achieves a planning success rate exceeding 90% in transient environments, outperforming traditional planners and other MARL baselines. Real-world experiments further validate the approach’s adaptability, demonstrating its practical value for safety-critical missions.
KW - Uncrewed aerial vehicle
KW - artificial intelligence
KW - deep reinforcement learning
KW - multi-agent system
KW - pathfinding
UR - https://www.scopus.com/pages/publications/105029619650
U2 - 10.1109/TASE.2026.3661266
DO - 10.1109/TASE.2026.3661266
M3 - Article
AN - SCOPUS:105029619650
SN - 1545-5955
VL - 23
SP - 4848
EP - 4860
JO - IEEE Transactions on Automation Science and Engineering
JF - IEEE Transactions on Automation Science and Engineering
ER -