TY - GEN
T1 - Three-dimensional trajectory design for multi-user MISO UAV communications
T2 - 2021 IEEE/CIC International Conference on Communications in China, ICCC 2021
AU - Wang, Yang
AU - Gao, Zhen
N1 - Publisher Copyright: © 2021 IEEE.
PY - 2021/7/28
Y1 - 2021/7/28
N2 - In this paper, we investigate a multi-user downlink multiple-input single-output (MISO) unmanned aerial vehicle (UAV) communication system, where a multi-antenna UAV is employed to serve multiple ground terminals. Unlike existing approaches that focus only on a simplified two-dimensional scenario, this paper considers a three-dimensional (3D) urban environment, where the UAV's 3D trajectory is designed to minimize the data transmission completion time subject to practical throughput and flight movement constraints. Specifically, we propose a deep reinforcement learning (DRL)-based trajectory design for completion time minimization (DRL-TDCTM), which is developed from a deep deterministic policy gradient algorithm. In particular, to represent the state information of the UAV and the environment, we introduce an additional piece of information, i.e., the merged pheromone, as a reference for the reward, which facilitates the algorithm design. By interacting with the external environment in the corresponding Markov decision process, the proposed algorithm can continuously and adaptively learn how to adjust the UAV's movement strategy. Finally, simulation results show the superiority of the proposed DRL-TDCTM algorithm over conventional baseline methods.
AB - In this paper, we investigate a multi-user downlink multiple-input single-output (MISO) unmanned aerial vehicle (UAV) communication system, where a multi-antenna UAV is employed to serve multiple ground terminals. Unlike existing approaches that focus only on a simplified two-dimensional scenario, this paper considers a three-dimensional (3D) urban environment, where the UAV's 3D trajectory is designed to minimize the data transmission completion time subject to practical throughput and flight movement constraints. Specifically, we propose a deep reinforcement learning (DRL)-based trajectory design for completion time minimization (DRL-TDCTM), which is developed from a deep deterministic policy gradient algorithm. In particular, to represent the state information of the UAV and the environment, we introduce an additional piece of information, i.e., the merged pheromone, as a reference for the reward, which facilitates the algorithm design. By interacting with the external environment in the corresponding Markov decision process, the proposed algorithm can continuously and adaptively learn how to adjust the UAV's movement strategy. Finally, simulation results show the superiority of the proposed DRL-TDCTM algorithm over conventional baseline methods.
KW - 3D trajectory design
KW - Deep reinforcement learning
KW - Multi-antenna UAV
KW - UAV communication systems
UR - http://www.scopus.com/inward/record.url?scp=85119342524&partnerID=8YFLogxK
U2 - 10.1109/ICCC52777.2021.9580401
DO - 10.1109/ICCC52777.2021.9580401
M3 - Conference contribution
AN - SCOPUS:85119342524
T3 - 2021 IEEE/CIC International Conference on Communications in China, ICCC 2021
SP - 706
EP - 711
BT - 2021 IEEE/CIC International Conference on Communications in China, ICCC 2021
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 28 July 2021 through 30 July 2021
ER -