TY - JOUR
T1 - Multi-Vehicle Cooperative Persistent Coverage for Random Target Search
AU - Li, Zhuo
AU - Li, Guangzheng
AU - Sadeghi, Alireza
AU - Sun, Jian
AU - Wang, Gang
AU - Wang, Jialin
N1 - Publisher Copyright:
© 2025 IEEE.
PY - 2025
AB - This letter investigates the target search problem for a network of autonomous vehicles, aiming to maximize the detection of randomly appearing targets within a given area. Since no prior knowledge of the targets is available, we propose a multi-vehicle cooperative persistent coverage scheme within the framework of multi-agent reinforcement learning, in contrast to the heuristic and model-based optimization methods of existing works. Owing to the vehicles' limited observation ranges, we model the persistent coverage problem as a partially observable Markov decision process (POMDP) and introduce a knowability map to characterize the vehicles' knowledge of the target area. Each vehicle employs a distributed estimator, leveraging its own observations and information shared by neighboring vehicles, to construct a globally estimated knowability map, thereby mitigating partial observability. The persistent coverage policies are learned under a centralized-training, distributed-execution architecture, enabling cooperative and efficient target search that fully exploits the shared information. Moreover, we propose an adaptive partition method for the target area that fixes the dimension of the POMDP state space, improving the scalability of the learned policy to target areas of various sizes. Simulations validate the effectiveness and scalability of the proposed cooperative scheme.
KW - Cooperative search
KW - multi-vehicle system
KW - persistent coverage
KW - reinforcement learning
UR - http://www.scopus.com/inward/record.url?scp=105004806847&partnerID=8YFLogxK
DO - 10.1109/LRA.2025.3568563
M3 - Article
AN - SCOPUS:105004806847
SN - 2377-3766
VL - 10
SP - 6680
EP - 6687
JF - IEEE Robotics and Automation Letters
IS - 7
ER -