Primal-Dual Deep Reinforcement Learning for Periodic Coverage-Assisted UAV Secure Communications

Yunhui Qin, Zhifang Xing, Xulong Li, Zhongshan Zhang*, Haijun Zhang

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Considering UAVs' energy constraints and green communication requirements, this paper proposes a periodic coverage-assisted UAV secure communication system that maximizes the worst-case average achievable secrecy rate. UAV base stations serve legitimate users while UAV jammers periodically transmit interference signals to eavesdroppers. User scheduling, UAV trajectory and power allocation are modeled as a constrained Markov decision process with a coverage evaluation constraint. The joint optimization of user scheduling, UAV trajectory and power allocation is then solved by a primal-dual soft actor-critic (SAC) algorithm. Specifically, the reward critic network assesses the secrecy rate and the cost critic network fits the coverage constraint. Meanwhile, the actor network generates the user scheduling, UAV trajectory and power allocation policy while the dual variables are updated. For comparison, we also evaluate other deep reinforcement learning (DRL) solutions, namely the SAC algorithm and the twin-delayed deep deterministic policy gradient (TD3) algorithm, as well as the traditional random and greedy methods. Simulation results show that the proposed algorithm outperforms these baselines in training speed, reward performance and secrecy rate.
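The primal-dual idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the learning rate, and the convention that the constraint is written as "average cost must not exceed a limit" are all assumptions made for illustration. The sketch shows only the two generic ingredients of a Lagrangian approach to a constrained Markov decision process: a reward penalized by the dual variable, and a projected gradient-ascent step that keeps the dual variable non-negative.

```python
def lagrangian_reward(reward: float, cost: float, lmbda: float) -> float:
    """Penalized objective seen by the policy (primal) update.

    Assumption: the constraint is expressed as a cost to be kept small,
    so a larger dual variable makes constraint violations more expensive.
    """
    return reward - lmbda * cost


def dual_update(lmbda: float, avg_cost: float, cost_limit: float, lr: float) -> float:
    """One projected gradient-ascent step on the dual variable.

    The multiplier grows when the average cost exceeds the limit,
    shrinks otherwise, and is clipped at zero so it stays non-negative.
    """
    return max(0.0, lmbda + lr * (avg_cost - cost_limit))


if __name__ == "__main__":
    # Hypothetical numbers, chosen only to show the direction of the update.
    lmbda = 0.5
    for avg_cost in (1.2, 1.1, 0.9):
        lmbda = dual_update(lmbda, avg_cost, cost_limit=1.0, lr=0.1)
        print(f"avg_cost={avg_cost:.1f} -> lambda={lmbda:.3f}")
```

In a full primal-dual SAC setup, the reward and cost terms would come from the two critic networks rather than from scalars, and the dual step would alternate with actor and critic updates; those parts are omitted here.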

Original language: English
Pages (from-to): 19641-19652
Number of pages: 12
Journal: IEEE Transactions on Vehicular Technology
Volume: 73
Issue number: 12
Publication status: Published - 2024

Keywords

  • Unmanned aerial vehicle (UAV)
  • constrained Markov decision process
  • deep reinforcement learning
  • periodic coverage evaluation
  • primal-dual optimization
