TY - JOUR
T1 - Deep Reinforcement Learning for Flocking Motion of Multi-UAV Systems
T2 - Learn From a Digital Twin
AU - Shen, Gaoqing
AU - Lei, Lei
AU - Li, Zhilin
AU - Cai, Shengsuo
AU - Zhang, Lijuan
AU - Cao, Pan
AU - Liu, Xiaojiao
N1 - Publisher Copyright: © 2014 IEEE.
PY - 2022/7/1
Y1 - 2022/7/1
N2 - Over the past decades, unmanned aerial vehicles (UAVs) have been widely used in both military and civilian fields. In these applications, flocking motion is a fundamental and crucial operation of multi-UAV systems. Traditional flocking motion methods are usually designed for a specific environment. However, the real environment is mostly unknown and stochastic, which greatly reduces the practicality of these methods. In this article, deep reinforcement learning (DRL) is used to realize the flocking motion of multi-UAV systems. Because the sim-to-real problem restricts the application of DRL to the flocking motion scenario, a digital twin (DT)-enabled DRL training framework is proposed to solve this problem. The DRL model can learn from the DT and be quickly deployed on real-world UAVs with the help of the DT. Under this training framework, this article proposes an actor-critic DRL algorithm inspired by the flocking behavior of animals, named behavior-coupling deep deterministic policy gradient (BCDDPG), for the flocking motion problem. Extensive simulations are conducted to evaluate the performance of BCDDPG. Simulation results show that BCDDPG achieves a higher average reward and performs better in terms of arrival rate and collision rate than existing methods.
AB - Over the past decades, unmanned aerial vehicles (UAVs) have been widely used in both military and civilian fields. In these applications, flocking motion is a fundamental and crucial operation of multi-UAV systems. Traditional flocking motion methods are usually designed for a specific environment. However, the real environment is mostly unknown and stochastic, which greatly reduces the practicality of these methods. In this article, deep reinforcement learning (DRL) is used to realize the flocking motion of multi-UAV systems. Because the sim-to-real problem restricts the application of DRL to the flocking motion scenario, a digital twin (DT)-enabled DRL training framework is proposed to solve this problem. The DRL model can learn from the DT and be quickly deployed on real-world UAVs with the help of the DT. Under this training framework, this article proposes an actor-critic DRL algorithm inspired by the flocking behavior of animals, named behavior-coupling deep deterministic policy gradient (BCDDPG), for the flocking motion problem. Extensive simulations are conducted to evaluate the performance of BCDDPG. Simulation results show that BCDDPG achieves a higher average reward and performs better in terms of arrival rate and collision rate than existing methods.
KW - Deep reinforcement learning (DRL)
KW - digital twin (DT)
KW - flocking motion
KW - multi-UAV systems
UR - http://www.scopus.com/inward/record.url?scp=85133277179&partnerID=8YFLogxK
U2 - 10.1109/JIOT.2021.3127873
DO - 10.1109/JIOT.2021.3127873
M3 - Article
AN - SCOPUS:85133277179
SN - 2327-4662
VL - 9
SP - 11141
EP - 11153
JO - IEEE Internet of Things Journal
JF - IEEE Internet of Things Journal
IS - 13
ER -