TY - GEN
T1 - Research on Multi-Agent Task Allocation and Path Planning Based on Pri-MADDPG
AU - Wang, Zhiwen
AU - Wang, Bo
AU - He, Xiao
AU - Fei, Qing
N1 - Publisher Copyright: © 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - In this paper, we develop a reinforcement learning (RL) based algorithm for the task allocation and path planning problem of multi-agent systems, in which all agents autonomously head to task points while avoiding obstacles. To address the slow convergence and inadequate reward settings of traditional RL methods, we propose Pri-MADDPG, an algorithm based on prioritized experience replay. By integrating the task allocation and path planning problems, we first construct a framework for multi-agent reinforcement learning training by designing its essential elements, including an appropriate observation space, action space, and reward functions. A prioritized experience replay method, in which the value network loss is used to evaluate priority, is then applied to enhance policy learning performance. The reward mechanism is further improved by taking into account both global task objectives and individual objectives. To verify the effectiveness of the Pri-MADDPG algorithm, experiments are carried out with the well-designed reward mechanism. The results demonstrate that all agents can autonomously accomplish task allocation with smooth and highly safe trajectories while achieving faster convergence, better stability, and superior performance.
KW - path planning
KW - prioritized experience replay
KW - reinforcement learning
KW - task allocation
UR - http://www.scopus.com/inward/record.url?scp=85189372716&partnerID=8YFLogxK
U2 - 10.1109/CAC59555.2023.10452082
DO - 10.1109/CAC59555.2023.10452082
M3 - Conference contribution
AN - SCOPUS:85189372716
T3 - Proceedings - 2023 China Automation Congress, CAC 2023
SP - 6569
EP - 6574
BT - Proceedings - 2023 China Automation Congress, CAC 2023
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2023 China Automation Congress, CAC 2023
Y2 - 17 November 2023 through 19 November 2023
ER -