Multi-UAV Cooperative Search Based on Reinforcement Learning With a Digital Twin Driven Training Framework

Gaoqing Shen, Lei Lei*, Xinting Zhang, Zhilin Li, Shengsuo Cai, Lijuan Zhang

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

45 Citations (Scopus)

Abstract

This paper considers the cooperative search for stationary targets by multiple unmanned aerial vehicles (UAVs) with limited sensing range and communication ability in a dynamic threatening environment. The main purpose is to use multiple UAVs to find more unknown targets as soon as possible, increase the coverage rate of the mission area, and, more importantly, guide UAVs away from threats. However, traditional search methods mostly do not scale and perform poorly in dynamic environments. A new multi-agent deep reinforcement learning (MADRL) method, DNQMIX, is proposed in this study to solve the multi-UAV cooperative target search (MCTS) problem. A new reward function is also designed for the MCTS problem to guide UAVs to explore and exploit the environment information more efficiently. Moreover, this paper proposes a digital twin (DT) driven training framework, 'centralized training, decentralized execution, and continuous evolution' (CTDECE). It can facilitate the continuous evolution of MADRL models and address the tradeoff between training speed and environment fidelity when MADRL is applied to real-world multi-UAV systems. Simulation results show that DNQMIX outperforms state-of-the-art methods in terms of search rate and coverage rate.

Original language: English
Pages (from-to): 8354-8368
Number of pages: 15
Journal: IEEE Transactions on Vehicular Technology
Volume: 72
Issue number: 7
Publication status: Published - 1 Jul 2023
Externally published: Yes

Keywords

  • Cooperative target search
  • digital twin
  • multi-agent deep reinforcement learning
  • unmanned aerial vehicles
