Primal-Dual Deep Reinforcement Learning for Periodic Coverage-Assisted UAV Secure Communications

Yunhui Qin; Zhifang Xing; Xulong Li; Zhongshan Zhang; Haijun Zhang

doi:10.1109/TVT.2024.3450956

Yunhui Qin, Zhifang Xing, Xulong Li, Zhongshan Zhang^*, Haijun Zhang

^*Corresponding author for this work

School of Cyberspace Science and Technology

Research output: Contribution to journal › Article › peer-review

Abstract

Considering UAVs' energy constraints and green communication requirements, this paper proposes a periodic coverage-assisted UAV secure communication system to maximize the worst-case average achievable secrecy rate. UAV base stations serve legitimate users while UAV jammers periodically transmit interference signals to eavesdroppers. User scheduling, UAV trajectory, and power allocation are modeled as a constrained Markov decision process with a coverage evaluation constraint. The joint optimization of user scheduling, UAV trajectory, and power allocation is then carried out by a primal-dual soft actor-critic (SAC) algorithm. Specifically, the reward critic network assesses the secrecy rate and the cost critic network fits the coverage constraint. Meanwhile, the actor network generates the user scheduling, UAV trajectory, and power allocation policy while updating the dual variables. For comparison, we also evaluate other deep reinforcement learning (DRL) solutions, namely the SAC algorithm and the twin-delayed deep deterministic policy gradient (TD3) algorithm, as well as the traditional random and greedy methods. Simulation results show that the proposed algorithm performs best in training speed, reward performance, and secrecy rate.
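The dual-variable update mentioned in the abstract can be illustrated with a generic primal-dual step for a constrained Markov decision process. This is only a minimal sketch of the standard technique, not the paper's algorithm: the function names, the step size, and the convention that the constraint is written as "average cost must not exceed a limit" are all assumptions made here for illustration.

```python
def dual_update(lam, avg_cost, cost_limit, step_size=0.01):
    """One projected gradient-ascent step on the dual variable (hypothetical helper).

    Assumes the constraint has the form avg_cost <= cost_limit. The multiplier
    grows when the constraint is violated, shrinks when it is satisfied, and is
    clipped at zero so it stays a valid Lagrange multiplier.
    """
    return max(0.0, lam + step_size * (avg_cost - cost_limit))


def lagrangian_reward(reward, cost, lam):
    """Penalized reward the primal (policy) update would maximize (hypothetical helper)."""
    return reward - lam * cost


if __name__ == "__main__":
    lam = 0.5
    # Constraint violated (1.2 > 1.0), so the multiplier increases.
    lam = dual_update(lam, avg_cost=1.2, cost_limit=1.0, step_size=0.1)
    print(lam)
    print(lagrangian_reward(reward=2.0, cost=0.5, lam=lam))
```

In a primal-dual actor-critic scheme of this kind, a reward critic and a cost critic would supply the estimates for `reward` and `cost`, and the policy update and the multiplier update would alternate during training.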

Original language	English
Pages (from-to)	19641-19652
Number of pages	12
Journal	IEEE Transactions on Vehicular Technology
Volume	73
Issue number	12
DOIs	https://doi.org/10.1109/TVT.2024.3450956
Publication status	Published - 2024

Keywords

Unmanned aerial vehicle (UAV)
constrained Markov decision process
deep reinforcement learning
periodic coverage evaluation
primal-dual optimization

Cite this

@article{c30944f0f27d48d8a93d451ef95bf260,

title = "Primal-Dual Deep Reinforcement Learning for Periodic Coverage-Assisted UAV Secure Communications",

abstract = "Considering the UAVs' energy constraints and green communication requirements, this paper proposes a periodic coverage-assisted UAV secure communication system to maximize the worst-case average achievable secrecy rate. UAV base stations serve legitimate users while UAV jammers periodically dispatch interference signals to eavesdroppers. User scheduling, UAV trajectory and power allocation are modeled as a constrained Markov decision problem with coverage evaluation constraint. Then, the joint optimization of user scheduling, UAV trajectory and power allocation is achieved by the primal-dual soft actor-critic (SAC) algorithm. Specifically, the reward critic network assesses the secrecy rate and the cost critic network fits the coverage constraint. Meanwhile, the actor network generates the user scheduling, UAV trajectory and power allocation policy while updating the dual variables. For comparison, we also adopt other deep reinforcement learning (DRL) solutions namely the SAC algorithm and the twin-delayed deep deterministic policy gradient (TD3) as well as the traditional random method and greedy method. Simulation results show that the proposed algorithm performs best in the training speed, the reward performance and the secrecy rate.",

keywords = "Unmanned aerial vehicle (UAV), constrained Markov decision process, deep reinforcement learning, periodic coverage evaluation, primal-dual optimization",

author = "Yunhui Qin and Zhifang Xing and Xulong Li and Zhongshan Zhang and Haijun Zhang",

note = "Publisher Copyright: {\textcopyright} 1967-2012 IEEE.",

year = "2024",

doi = "10.1109/TVT.2024.3450956",

language = "English",

volume = "73",

pages = "19641--19652",

journal = "IEEE Transactions on Vehicular Technology",

issn = "0018-9545",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

number = "12",

}

TY - JOUR

T1 - Primal-Dual Deep Reinforcement Learning for Periodic Coverage-Assisted UAV Secure Communications

AU - Qin, Yunhui

AU - Xing, Zhifang

AU - Li, Xulong

AU - Zhang, Zhongshan

AU - Zhang, Haijun

PY - 2024

Y1 - 2024

N2 - Considering the UAVs' energy constraints and green communication requirements, this paper proposes a periodic coverage-assisted UAV secure communication system to maximize the worst-case average achievable secrecy rate. UAV base stations serve legitimate users while UAV jammers periodically dispatch interference signals to eavesdroppers. User scheduling, UAV trajectory and power allocation are modeled as a constrained Markov decision problem with coverage evaluation constraint. Then, the joint optimization of user scheduling, UAV trajectory and power allocation is achieved by the primal-dual soft actor-critic (SAC) algorithm. Specifically, the reward critic network assesses the secrecy rate and the cost critic network fits the coverage constraint. Meanwhile, the actor network generates the user scheduling, UAV trajectory and power allocation policy while updating the dual variables. For comparison, we also adopt other deep reinforcement learning (DRL) solutions namely the SAC algorithm and the twin-delayed deep deterministic policy gradient (TD3) as well as the traditional random method and greedy method. Simulation results show that the proposed algorithm performs best in the training speed, the reward performance and the secrecy rate.

KW - Unmanned aerial vehicle (UAV)

KW - constrained Markov decision process

KW - deep reinforcement learning

KW - periodic coverage evaluation

KW - primal-dual optimization

UR - http://www.scopus.com/inward/record.url?scp=85202766230&partnerID=8YFLogxK

U2 - 10.1109/TVT.2024.3450956

DO - 10.1109/TVT.2024.3450956

M3 - Article

AN - SCOPUS:85202766230

SN - 0018-9545

VL - 73

SP - 19641

EP - 19652

JO - IEEE Transactions on Vehicular Technology

JF - IEEE Transactions on Vehicular Technology

IS - 12

ER -
