A Reinforcement Learning-Enhanced Spoofing Algorithm for UAV With GPS/INS-Integrated Navigation

Xiaomeng Ma, Taohan Sun, Meiguo Gao*

*Corresponding author for this work

Research output: Contribution to journal, Article, peer-review

1 Citation (Scopus)

Abstract

This paper optimizes covert deception of UAV GPS/INS integrated navigation systems by combining spatial information entropy (SIE) with maximum entropy reinforcement learning (MERL). Specifically, we use SIE to describe spatial correlations and thereby refine the entropy term in MERL, with the aim of achieving an improved distribution of navigational spoofing positions. Because UAV flight control commands depend only on the current positioning result, which may originate from either authentic or counterfeit satellite signals, the navigation deception process satisfies the Markov property. The paper then provides theoretical evidence that navigational spoofing positions follow a Gaussian random distribution under radar Kalman filter (KF) estimation, and introduces chi-square distributed random variables to constrain the stealthiness and stability of navigational spoofing. Based on these constraints, we formulate a reward function that jointly accounts for concealment of the deception position, stability of the deception trajectory, and successful navigation of the victim UAV to the desired destination. To achieve these objectives together, we introduce SIE to characterize the positional correlation among the deception location, the actual destination, and the deception destination. Finally, we propose an algorithm based on soft actor-critic (SAC) and SIE, named SIE-SAC, to coordinate the learning of the deception strategy and the SIE. Without prior knowledge of the UAV's preset reference trajectory or internal KF parameters, comparative results show that introducing SIE improves the concealment of the deception position. Additionally, ablation experiments validate the effect of the constraint formulation on the stability of the deceptive trajectory, and the covert navigation spoofing effect of SIE-SAC extends readily to three-dimensional space.
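
The abstract does not give the exact form of the chi-square constraint. As a rough illustration only, the sketch below assumes a common innovation-gate formulation: a spoofed 2-D position counts as stealthy when its normalized innovation squared (NIS) against a KF-predicted position stays below a chi-square quantile. The function names, the 2-D setting, and the 0.99 confidence level are all hypothetical choices made for this example, not details taken from the paper.

```python
import math
import numpy as np


def chi_square_threshold_2d(confidence: float) -> float:
    """Quantile of a chi-square variable with 2 degrees of freedom.

    For 2 DOF the CDF is 1 - exp(-x / 2), so the quantile has a closed form.
    """
    return -2.0 * math.log(1.0 - confidence)


def normalized_innovation_squared(innovation, innovation_cov) -> float:
    """Compute nu^T S^{-1} nu without forming the explicit inverse."""
    nu = np.asarray(innovation, dtype=float)
    S = np.asarray(innovation_cov, dtype=float)
    return float(nu @ np.linalg.solve(S, nu))


def is_stealthy(spoofed_pos, predicted_pos, innovation_cov, confidence=0.99) -> bool:
    """Assumed stealth test: the spoofed position passes a chi-square gate.

    Hypothetical helper for illustration; the paper's actual constraint may differ.
    """
    nu = np.asarray(spoofed_pos, dtype=float) - np.asarray(predicted_pos, dtype=float)
    nis = normalized_innovation_squared(nu, innovation_cov)
    return nis <= chi_square_threshold_2d(confidence)


if __name__ == "__main__":
    S = np.eye(2)  # assumed innovation covariance (unit variance per axis)
    print(is_stealthy([1.0, 1.0], [0.0, 0.0], S))  # small offset
    print(is_stealthy([4.0, 0.0], [0.0, 0.0], S))  # large offset
```

Under these assumptions, a gate of this kind could feed a concealment term in a reward function: offsets that pass the gate are left unpenalized, while offsets that exceed it are penalized. How the paper actually weights concealment, stability, and destination terms is not stated in the abstract.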

Original language: English
Journal: IEEE Transactions on Aerospace and Electronic Systems
Publication status: Accepted/In press - 2025

Keywords

  • deep reinforcement learning
  • GPS
  • information entropy
  • navigation spoofing
  • UAV
