TY - JOUR
T1 - Optimization of high-speed fixed-wing UAV penetration strategy based on deep reinforcement learning
AU - Zhuang, Xing
AU - Li, Dongguang
AU - Wang, Yue
AU - Liu, Xinyu
AU - Li, Hanyu
N1 - Publisher Copyright: © 2024 Elsevier Masson SAS
PY - 2024/5
Y1 - 2024/5
N2 - The penetration decision of unmanned aerial vehicles (UAVs) is one of the key components for UAVs to detect or strike crucial targets in adversarial environments. Presently, most methods lack research on penetration decisions under unknown interception information. The executing UAV for decision-making is generally a simplified three-degree-of-freedom model. The decision format is often simplified to path point optimization, lacking control over the stability of complex model attitudes and exhibiting relatively poor robustness. This paper proposes a penetration relative motion theory and aircraft control method based on the six-degrees-of-freedom UAV model to reduce response errors between algorithm control decisions and actual flight control. Using a Markov decision process, a UAV penetration decision control method based on an attitude-overload control loop is designed. This method combines reinforcement learning techniques to enable autonomous penetration decision-making and control for UAVs facing active defense scenarios, enhancing the autonomy, accuracy, and generalization of UAV online decision-making. The paper concludes by implementing autonomous decision-making and control for UAV penetration through the integration of interceptor trajectory prediction with proximal policy optimization. This method reduces the dependency of penetration strategies on interceptor information, ensuring both the generalization ability of penetration strategy models under uncertain information and the consistency of decision stability. Through simulations and experiments, we verified that the penetration strategy based on the pre-sampling PPO method improves the penetration effect by 15.44 % on average and the hit rate by 18.48 %. Moreover, the probability of penetration in different scenarios exceeds 70 %, and the penetration effect is improved by more than 10 % in the case of multiple interceptors. This paper also analyzes the impact of interceptor spatial distribution and the quantity of interceptors on the effectiveness of UAV penetration decision-making.
AB - The penetration decision of unmanned aerial vehicles (UAVs) is one of the key components for UAVs to detect or strike crucial targets in adversarial environments. Presently, most methods lack research on penetration decisions under unknown interception information. The executing UAV for decision-making is generally a simplified three-degree-of-freedom model. The decision format is often simplified to path point optimization, lacking control over the stability of complex model attitudes and exhibiting relatively poor robustness. This paper proposes a penetration relative motion theory and aircraft control method based on the six-degrees-of-freedom UAV model to reduce response errors between algorithm control decisions and actual flight control. Using a Markov decision process, a UAV penetration decision control method based on an attitude-overload control loop is designed. This method combines reinforcement learning techniques to enable autonomous penetration decision-making and control for UAVs facing active defense scenarios, enhancing the autonomy, accuracy, and generalization of UAV online decision-making. The paper concludes by implementing autonomous decision-making and control for UAV penetration through the integration of interceptor trajectory prediction with proximal policy optimization. This method reduces the dependency of penetration strategies on interceptor information, ensuring both the generalization ability of penetration strategy models under uncertain information and the consistency of decision stability. Through simulations and experiments, we verified that the penetration strategy based on the pre-sampling PPO method improves the penetration effect by 15.44 % on average and the hit rate by 18.48 %. Moreover, the probability of penetration in different scenarios exceeds 70 %, and the penetration effect is improved by more than 10 % in the case of multiple interceptors. This paper also analyzes the impact of interceptor spatial distribution and the quantity of interceptors on the effectiveness of UAV penetration decision-making.
KW - Guidance control
KW - Maneuvering strategy
KW - Penetration defense
KW - Reinforcement learning
KW - UAV
UR - http://www.scopus.com/inward/record.url?scp=85189663621&partnerID=8YFLogxK
U2 - 10.1016/j.ast.2024.109089
DO - 10.1016/j.ast.2024.109089
M3 - Article
AN - SCOPUS:85189663621
SN - 1270-9638
VL - 148
JO - Aerospace Science and Technology
JF - Aerospace Science and Technology
M1 - 109089
ER -