TY - JOUR
T1 - A Reinforcement Learning-Enhanced Spoofing Algorithm for UAV With GPS/INS-Integrated Navigation
AU - Ma, Xiaomeng
AU - Sun, Taohan
AU - Gao, Meiguo
N1 - Publisher Copyright: © 1965-2011 IEEE.
PY - 2025
Y1 - 2025
N2 - This paper optimizes covert deception of UAV GPS/INS integrated navigation systems by combining spatial information entropy (SIE) with maximum entropy reinforcement learning (MERL). Specifically, SIE is used to describe spatial correlations and thereby refine the entropy term in MERL, with the aim of improving the distribution of navigational spoofing positions. Because UAV flight control commands depend only on the current positioning result, whether that result comes from authentic or counterfeit satellite signals, the navigation deception process satisfies the Markov property. The paper then shows theoretically that navigational spoofing positions based on radar Kalman filter (KF) estimation follow a Gaussian distribution, and introduces chi-square distributed random variables to constrain the stealthiness and stability of the spoofing. Based on these constraints, a reward function is formulated that jointly accounts for concealment of the deception position, stability of the deception trajectory, and successful navigation of the victim UAV to the desired destination. To achieve these objectives together, SIE is introduced to characterize the positional correlation among the deception location, the actual destination, and the deception destination. Finally, an algorithm based on soft actor-critic (SAC) and SIE, named SIE-SAC, is proposed to coordinate the learning of the deception strategy and the SIE. Without prior knowledge of the UAV's preset reference trajectory or internal KF parameters, comparative results demonstrate that introducing SIE enhances the concealment of the deception position. Additionally, ablation experiments validate the impact of the constraint formulation on the stability of the deceptive trajectory, and the covert navigation spoofing effect of SIE-SAC extends readily to three-dimensional space.
KW - deep reinforcement learning
KW - GPS
KW - information entropy
KW - navigation spoofing
KW - UAV
UR - http://www.scopus.com/inward/record.url?scp=85218893106&partnerID=8YFLogxK
U2 - 10.1109/TAES.2025.3545388
DO - 10.1109/TAES.2025.3545388
M3 - Article
AN - SCOPUS:85218893106
SN - 0018-9251
JO - IEEE Transactions on Aerospace and Electronic Systems
JF - IEEE Transactions on Aerospace and Electronic Systems
ER -