TY - JOUR
T1 - The Dynamic Response of Dual Cellular-Connected UAVs for Random Real-Time Communication Requests from Multiple Hotspots
T2 - A Deep Reinforcement Learning Approach
AU - Yang, Shengzhi
AU - Zhou, Jianming
AU - Meng, Xiao
N1 - Publisher Copyright:
© 2024 by the authors.
PY - 2024/11
Y1 - 2024/11
N2 - The use of multiple cellular-connected UAVs as inspectors for automatic surveillance and monitoring is becoming increasingly popular. In practice, however, UAVs must respond to several service requests from different hotspots, and these requests are typically random in their arrival time, data amount, and concurrency. This paper proposes a dynamic dual-UAV response policy for multi-hotspot services based on single-agent deep Q-learning, in which UAVs controlled by a ground base station are dispatched automatically to hotspots and then send videos back. First, the issue is formulated as an optimization problem whose goal is to maximize the number of successfully served requests under constraints on both the UAV’s energy budget and the request waiting time. Second, a reward function based on service completion is designed to overcome the challenges posed by delayed rewards. Finally, a simulation was conducted comparing the proposed algorithm with the conventional time-priority and distance-priority algorithms. The results show that the proposed algorithm achieves one more successful response than the others across different service densities, with the fewest failures and an appropriate average waiting time. This method offers a technical solution to the joint communication-and-control problem of multiple UAVs in complex situations.
AB - The use of multiple cellular-connected UAVs as inspectors for automatic surveillance and monitoring is becoming increasingly popular. In practice, however, UAVs must respond to several service requests from different hotspots, and these requests are typically random in their arrival time, data amount, and concurrency. This paper proposes a dynamic dual-UAV response policy for multi-hotspot services based on single-agent deep Q-learning, in which UAVs controlled by a ground base station are dispatched automatically to hotspots and then send videos back. First, the issue is formulated as an optimization problem whose goal is to maximize the number of successfully served requests under constraints on both the UAV’s energy budget and the request waiting time. Second, a reward function based on service completion is designed to overcome the challenges posed by delayed rewards. Finally, a simulation was conducted comparing the proposed algorithm with the conventional time-priority and distance-priority algorithms. The results show that the proposed algorithm achieves one more successful response than the others across different service densities, with the fewest failures and an appropriate average waiting time. This method offers a technical solution to the joint communication-and-control problem of multiple UAVs in complex situations.
KW - communication service
KW - deep Q-learning
KW - dynamic response
KW - energy constraints
KW - multi-UAV
KW - UAV surveillance
UR - http://www.scopus.com/inward/record.url?scp=85208531709&partnerID=8YFLogxK
U2 - 10.3390/electronics13214181
DO - 10.3390/electronics13214181
M3 - Article
AN - SCOPUS:85208531709
SN - 2079-9292
VL - 13
JO - Electronics (Switzerland)
JF - Electronics (Switzerland)
IS - 21
M1 - 4181
ER -