Multi-agent policy learning-based path planning for autonomous mobile robots

Lixiang Zhang, Ze Cai, Yan Yan, Chen Yang*, Yaoguang Hu*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

8 Citations (Scopus)

Abstract

This study addresses path planning problems for autonomous mobile robots (AMRs) that account for their kinematics, where performance and responsiveness are often incompatible. A multi-agent policy learning-based method is proposed to tackle this challenge in dynamic environments. The proposed method features a centralized learning and decentralized execution path planning framework designed to meet both performance and responsiveness requirements. The problem is modeled as a partially observable Markov decision process for policy learning, with the kinematics captured by convolutional neural networks. An improved proximal policy optimization algorithm is then developed with hindsight experience replay, which corrects failed experiences to speed up learning. The experimental results show that the proposed method outperforms the baselines in both static and dynamic environments: it shortens the movement distance and time by about 29.1% and 5.7% in static environments, and by about 21.1% and 20.4% in dynamic environments, respectively. The runtime remains at the millisecond level across various environments, taking only 0.07 s. Overall, the proposed method is effective and efficient in ensuring the performance and responsiveness of AMRs when dealing with complex and dynamic path planning problems.
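
For readers unfamiliar with the hindsight-style correction of failed experiences mentioned in the abstract, the sketch below illustrates the general idea in plain Python: a trajectory that missed its goal is copied and relabeled with the goal that was actually reached, so it still yields useful reward signal for policy learning. The function name, trajectory layout, and sparse reward scheme are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch of hindsight-style experience relabeling (assumed
# details, not the paper's implementation): a failed goal-reaching
# trajectory is relabeled with the goal the robot actually reached, turning
# a failure into a successful example the policy can still learn from.

def relabel_with_achieved_goal(trajectory, goal_tolerance=0.1):
    """trajectory: list of dicts with 'state', 'action', 'achieved', 'goal'.

    Returns a relabeled copy whose goal is the final achieved position,
    with sparse rewards recomputed against that new goal.
    """
    achieved_goal = trajectory[-1]["achieved"]  # hypothetical field layout
    relabeled = []
    for step in trajectory:
        new_step = dict(step)
        new_step["goal"] = achieved_goal
        # Sparse reward: 0 when within tolerance of the new goal, -1 otherwise.
        dist = sum((a - g) ** 2 for a, g in zip(step["achieved"], achieved_goal)) ** 0.5
        new_step["reward"] = 0.0 if dist <= goal_tolerance else -1.0
        relabeled.append(new_step)
    return relabeled


if __name__ == "__main__":
    # A toy 2-D trajectory that missed its original goal at (2.0, 2.0).
    traj = [
        {"state": (0.0, 0.0), "action": (1, 0), "achieved": (0.5, 0.0), "goal": (2.0, 2.0)},
        {"state": (0.5, 0.0), "action": (0, 1), "achieved": (0.5, 0.6), "goal": (2.0, 2.0)},
        {"state": (0.5, 0.6), "action": (1, 1), "achieved": (1.1, 1.2), "goal": (2.0, 2.0)},
    ]
    for step in relabel_with_achieved_goal(traj):
        print(step["goal"], step["reward"])
```

In practice such relabeled transitions would simply be added to the training batch alongside the original ones before the proximal policy optimization update; the sketch only shows the relabeling step itself.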

Original language: English
Article number: 107631
Journal: Engineering Applications of Artificial Intelligence
Volume: 129
Publication status: Published - Mar 2024

Keywords

  • Deep reinforcement learning
  • Dynamics
  • Multi-agent systems
  • Path planning
  • Proximal policy optimization

