TY - CONF
T1 - Distributed Deep Reinforcement Learning for Dynamic Task Scheduling in Multi-Robot Systems
AU - Song, Peng
AU - Xiao, Yichen
AU - Cui, Kaixin
AU - Wang, Junzheng
AU - Shi, Dawei
N1 - Publisher Copyright:
© 2024 IEEE.
PY - 2024
AB - With the increasing task scale and dynamic complexity of multi-robot systems in automated production lines, we design a dynamic distributed task scheduling framework that accelerates the convergence of the combinatorial optimization model and suits time-varying multi-task, multi-robot scenarios. By integrating an empirical learning model and a teammate collaboration model, a distributed deep reinforcement learning algorithm is formulated for a limited number of workstations. Each workstation is designed as an independent agent that interacts with the environment to learn the current allocation state of the robots. A Q-learning network is then trained to extract high-dimensional features from this state and optimize the task scheduling policy. In addition, a greedy strategy is incorporated into the Q-learning network to favor actions whose Q-values show an increasing trend, enabling the algorithm to prioritize higher-priority tasks under resource limitations. Simulations with three different workload intensities demonstrate that our algorithm improves overall performance by 3.50%, 8.16%, and 3.86%, respectively, compared with foundational deep reinforcement learning models.
KW - distributed deep reinforcement learning
KW - dynamic task scheduling
KW - multi-robot system
UR - http://www.scopus.com/inward/record.url?scp=105002220865&partnerID=8YFLogxK
DO - 10.1109/ONCON62778.2024.10931427
M3 - Conference contribution
AN - SCOPUS:105002220865
T3 - 2024 IEEE 3rd Industrial Electronics Society Annual On-Line Conference, ONCON 2024
BT - 2024 IEEE 3rd Industrial Electronics Society Annual On-Line Conference, ONCON 2024
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 3rd IEEE Industrial Electronics Society Annual On-Line Conference, ONCON 2024
Y2 - 8 December 2024 through 10 December 2024
ER -