Three-dimensional trajectory design for multi-user MISO UAV communications: A deep reinforcement learning approach

Yang Wang; Zhen Gao

doi:10.1109/ICCC52777.2021.9580401

Advanced Research Institute of Multidisciplinary Science

Beijing Institute of Technology

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Peer-reviewed

4 citations (Scopus)

Abstract

In this paper, we investigate a multi-user downlink multiple-input single-output (MISO) unmanned aerial vehicle (UAV) communication system, where a multi-antenna UAV is employed to serve multiple ground terminals. Unlike existing approaches that focus only on a simplified two-dimensional scenario, this paper considers a three-dimensional (3D) urban environment, where the UAV's 3D trajectory is designed to minimize data transmission completion time subject to practical throughput and flight movement constraints. Specifically, we propose a deep reinforcement learning (DRL)-based trajectory design for completion time minimization (DRL-TDCTM), which is developed from a deep deterministic policy gradient algorithm. In particular, to represent the state information of the UAV and the environment, we introduce an additional piece of information, i.e., the merged pheromone, as a reference for the reward, which facilitates the algorithm design. By interacting with the external environment in the corresponding Markov decision process, the proposed algorithm can continuously and adaptively learn how to adjust the UAV's movement strategy. Finally, simulation results show the superiority of the proposed DRL-TDCTM algorithm over the conventional baseline methods.

Original language	English
Title of host publication	2021 IEEE/CIC International Conference on Communications in China, ICCC 2021
Publisher	Institute of Electrical and Electronics Engineers Inc.
Pages	706-711
Number of pages	6
ISBN (electronic)	9781665443852
DOI	https://doi.org/10.1109/ICCC52777.2021.9580401
Publication status	Published - 28 Jul 2021
Event	2021 IEEE/CIC International Conference on Communications in China, ICCC 2021 - Xiamen, China. Duration: 28 Jul 2021 → 30 Jul 2021

Publication series

Name	2021 IEEE/CIC International Conference on Communications in China, ICCC 2021

Conference

Conference	2021 IEEE/CIC International Conference on Communications in China, ICCC 2021
Country/Territory	China
City	Xiamen
Period	28/07/21 → 30/07/21

Cite this

Wang, Y., & Gao, Z. (2021). Three-dimensional trajectory design for multi-user MISO UAV communications: A deep reinforcement learning approach. In 2021 IEEE/CIC International Conference on Communications in China, ICCC 2021 (pp. 706-711). (2021 IEEE/CIC International Conference on Communications in China, ICCC 2021). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ICCC52777.2021.9580401

Wang, Yang ; Gao, Zhen. / Three-dimensional trajectory design for multi-user MISO UAV communications : A deep reinforcement learning approach. 2021 IEEE/CIC International Conference on Communications in China, ICCC 2021. Institute of Electrical and Electronics Engineers Inc., 2021. pp. 706-711 (2021 IEEE/CIC International Conference on Communications in China, ICCC 2021).

@inproceedings{fac1e2eea3f443b785ae4cfe3a3efd79,

title = "Three-dimensional trajectory design for multi-user MISO UAV communications: A deep reinforcement learning approach",

abstract = "In this paper, we investigate a multi-user downlink multiple-input single-output (MISO) unmanned aerial vehicle (UAV) communication system, where a multi-antenna UAV is employed to serve multiple ground terminals. Unlike existing approaches that focus only on a simplified two-dimensional scenario, this paper considers a three-dimensional (3D) urban environment, where the UAV's 3D trajectory is designed to minimize data transmission completion time subject to practical throughput and flight movement constraints. Specifically, we propose a deep reinforcement learning (DRL)-based trajectory design for completion time minimization (DRL-TDCTM), which is developed from a deep deterministic policy gradient algorithm. In particular, to represent the state information of the UAV and the environment, we introduce an additional piece of information, i.e., the merged pheromone, as a reference for the reward, which facilitates the algorithm design. By interacting with the external environment in the corresponding Markov decision process, the proposed algorithm can continuously and adaptively learn how to adjust the UAV's movement strategy. Finally, simulation results show the superiority of the proposed DRL-TDCTM algorithm over the conventional baseline methods.",

keywords = "3D trajectory design, Deep reinforcement learning, Multi-antenna UAV, UAV communication systems",

author = "Yang Wang and Zhen Gao",

note = "Publisher Copyright: {\textcopyright} 2021 IEEE.; 2021 IEEE/CIC International Conference on Communications in China, ICCC 2021 ; Conference date: 28-07-2021 Through 30-07-2021",

year = "2021",

month = jul,

day = "28",

doi = "10.1109/ICCC52777.2021.9580401",

language = "English",

series = "2021 IEEE/CIC International Conference on Communications in China, ICCC 2021",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

pages = "706--711",

booktitle = "2021 IEEE/CIC International Conference on Communications in China, ICCC 2021",

address = "United States",

}

Wang, Y & Gao, Z 2021, Three-dimensional trajectory design for multi-user MISO UAV communications: A deep reinforcement learning approach. in 2021 IEEE/CIC International Conference on Communications in China, ICCC 2021. 2021 IEEE/CIC International Conference on Communications in China, ICCC 2021, Institute of Electrical and Electronics Engineers Inc., pp. 706-711, 2021 IEEE/CIC International Conference on Communications in China, ICCC 2021, Xiamen, China, 28/07/21. https://doi.org/10.1109/ICCC52777.2021.9580401

Three-dimensional trajectory design for multi-user MISO UAV communications: A deep reinforcement learning approach. / Wang, Yang; Gao, Zhen.
2021 IEEE/CIC International Conference on Communications in China, ICCC 2021. Institute of Electrical and Electronics Engineers Inc., 2021. pp. 706-711 (2021 IEEE/CIC International Conference on Communications in China, ICCC 2021).

TY - GEN

T1 - Three-dimensional trajectory design for multi-user MISO UAV communications

T2 - 2021 IEEE/CIC International Conference on Communications in China, ICCC 2021

AU - Wang, Yang

AU - Gao, Zhen

PY - 2021/7/28

Y1 - 2021/7/28

N2 - In this paper, we investigate a multi-user downlink multiple-input single-output (MISO) unmanned aerial vehicle (UAV) communication system, where a multi-antenna UAV is employed to serve multiple ground terminals. Unlike existing approaches that focus only on a simplified two-dimensional scenario, this paper considers a three-dimensional (3D) urban environment, where the UAV's 3D trajectory is designed to minimize data transmission completion time subject to practical throughput and flight movement constraints. Specifically, we propose a deep reinforcement learning (DRL)-based trajectory design for completion time minimization (DRL-TDCTM), which is developed from a deep deterministic policy gradient algorithm. In particular, to represent the state information of the UAV and the environment, we introduce an additional piece of information, i.e., the merged pheromone, as a reference for the reward, which facilitates the algorithm design. By interacting with the external environment in the corresponding Markov decision process, the proposed algorithm can continuously and adaptively learn how to adjust the UAV's movement strategy. Finally, simulation results show the superiority of the proposed DRL-TDCTM algorithm over the conventional baseline methods.

AB - In this paper, we investigate a multi-user downlink multiple-input single-output (MISO) unmanned aerial vehicle (UAV) communication system, where a multi-antenna UAV is employed to serve multiple ground terminals. Unlike existing approaches that focus only on a simplified two-dimensional scenario, this paper considers a three-dimensional (3D) urban environment, where the UAV's 3D trajectory is designed to minimize data transmission completion time subject to practical throughput and flight movement constraints. Specifically, we propose a deep reinforcement learning (DRL)-based trajectory design for completion time minimization (DRL-TDCTM), which is developed from a deep deterministic policy gradient algorithm. In particular, to represent the state information of the UAV and the environment, we introduce an additional piece of information, i.e., the merged pheromone, as a reference for the reward, which facilitates the algorithm design. By interacting with the external environment in the corresponding Markov decision process, the proposed algorithm can continuously and adaptively learn how to adjust the UAV's movement strategy. Finally, simulation results show the superiority of the proposed DRL-TDCTM algorithm over the conventional baseline methods.

KW - 3D trajectory design

KW - Deep reinforcement learning

KW - Multi-antenna UAV

KW - UAV communication systems

UR - http://www.scopus.com/inward/record.url?scp=85119342524&partnerID=8YFLogxK

U2 - 10.1109/ICCC52777.2021.9580401

DO - 10.1109/ICCC52777.2021.9580401

M3 - Conference contribution

AN - SCOPUS:85119342524

T3 - 2021 IEEE/CIC International Conference on Communications in China, ICCC 2021

SP - 706

EP - 711

BT - 2021 IEEE/CIC International Conference on Communications in China, ICCC 2021

PB - Institute of Electrical and Electronics Engineers Inc.

Y2 - 28 July 2021 through 30 July 2021

ER -

Wang Y, Gao Z. Three-dimensional trajectory design for multi-user MISO UAV communications: A deep reinforcement learning approach. In 2021 IEEE/CIC International Conference on Communications in China, ICCC 2021. Institute of Electrical and Electronics Engineers Inc. 2021. p. 706-711. (2021 IEEE/CIC International Conference on Communications in China, ICCC 2021). doi: 10.1109/ICCC52777.2021.9580401
