TY - JOUR
T1 - Deep Reinforcement Learning Based Resource Allocation and Trajectory Planning in Integrated Sensing and Communications UAV Network
AU - Qin, Yunhui
AU - Zhang, Zhongshan
AU - Li, Xulong
AU - Huangfu, Wei
AU - Zhang, Haijun
N1 - Publisher Copyright:
© 2002-2012 IEEE.
PY - 2023/11/1
Y1 - 2023/11/1
N2 - In this paper, multiple UAVs serve as mobile aerial ISAC platforms to sense and communicate with ground target users. To optimize the communication and sensing performance, we formulate a joint user association, UAV trajectory planning, and power allocation problem to maximize the minimum weighted spectral efficiency among UAVs. This paper exploits both centralized and decentralized deep reinforcement learning (DRL) solutions to solve the resulting sequential decision-making problem. On the one hand, we first introduce the centralized soft actor-critic (SAC) algorithm. Then, we explore an equivalent transformation of the optimization objective based on the symmetric group, propose random and adaptive data augmentation schemes to design the replay memory buffer of SAC, and accordingly propose data-augmentation-assisted SAC algorithms to tackle the transformed problem. On the other hand, the multi-agent soft actor-critic (MASAC), a decentralized solution, is also introduced to solve this sequential decision-making problem. The experimental results demonstrate the effectiveness of both the centralized and the decentralized solutions in the considered scenarios. Specifically, the SAC algorithm assisted by the adaptive scheme significantly outperforms the other centralized solutions in training speed and weighted spectral efficiency. Meanwhile, the decentralized MASAC algorithm achieves the fastest early-stage training.
AB - In this paper, multiple UAVs serve as mobile aerial ISAC platforms to sense and communicate with ground target users. To optimize the communication and sensing performance, we formulate a joint user association, UAV trajectory planning, and power allocation problem to maximize the minimum weighted spectral efficiency among UAVs. This paper exploits both centralized and decentralized deep reinforcement learning (DRL) solutions to solve the resulting sequential decision-making problem. On the one hand, we first introduce the centralized soft actor-critic (SAC) algorithm. Then, we explore an equivalent transformation of the optimization objective based on the symmetric group, propose random and adaptive data augmentation schemes to design the replay memory buffer of SAC, and accordingly propose data-augmentation-assisted SAC algorithms to tackle the transformed problem. On the other hand, the multi-agent soft actor-critic (MASAC), a decentralized solution, is also introduced to solve this sequential decision-making problem. The experimental results demonstrate the effectiveness of both the centralized and the decentralized solutions in the considered scenarios. Specifically, the SAC algorithm assisted by the adaptive scheme significantly outperforms the other centralized solutions in training speed and weighted spectral efficiency. Meanwhile, the decentralized MASAC algorithm achieves the fastest early-stage training.
KW - Integrated sensing and communications (ISAC)
KW - deep reinforcement learning
KW - power allocation
KW - trajectory planning
KW - unmanned aerial vehicle (UAV)
UR - http://www.scopus.com/inward/record.url?scp=85151534164&partnerID=8YFLogxK
U2 - 10.1109/TWC.2023.3260304
DO - 10.1109/TWC.2023.3260304
M3 - Article
AN - SCOPUS:85151534164
SN - 1536-1276
VL - 22
SP - 8158
EP - 8169
JO - IEEE Transactions on Wireless Communications
JF - IEEE Transactions on Wireless Communications
IS - 11
ER -