TY - JOUR
T1 - Meta-learning for dynamic multi-robot task scheduling
AU - Song, Peng
AU - Chen, Huaiyu
AU - Cui, Kaixin
AU - Wang, Junzheng
AU - Shi, Dawei
N1 - Publisher Copyright:
© 2025
PY - 2025/10
Y1 - 2025/10
N2 - In this work, we investigate the problem of dynamic task scheduling for multi-robot systems, in which a large number of robots collaborate to achieve a multi-objective optimization goal in transportation, rescue, etc. Considering the dynamic characteristics of tasks and robots in industrial scenarios, a reinforcement learning scheduling algorithm based on a meta-learning framework is proposed, which learns to interact with the environment to obtain an optimal solution. A DenseNet-like deep Q-network is designed to mine high-level features of a state matrix whose size changes dynamically with the scenario settings. By optimizing network parameters in inner and outer meta-learning loops, the Q-network learns from the experience of multiple scheduling scenarios and obtains a generalized initialization parameter, which can be fine-tuned online to adapt to a new multi-robot system. The effectiveness of the proposed meta-scheduling approach is illustrated by numerical simulations in 9 different multi-robot scenarios, achieving an 11.0% higher objective score and a 63.9% reduction in training time compared with a standard deep Q-learning algorithm.
AB - In this work, we investigate the problem of dynamic task scheduling for multi-robot systems, in which a large number of robots collaborate to achieve a multi-objective optimization goal in transportation, rescue, etc. Considering the dynamic characteristics of tasks and robots in industrial scenarios, a reinforcement learning scheduling algorithm based on a meta-learning framework is proposed, which learns to interact with the environment to obtain an optimal solution. A DenseNet-like deep Q-network is designed to mine high-level features of a state matrix whose size changes dynamically with the scenario settings. By optimizing network parameters in inner and outer meta-learning loops, the Q-network learns from the experience of multiple scheduling scenarios and obtains a generalized initialization parameter, which can be fine-tuned online to adapt to a new multi-robot system. The effectiveness of the proposed meta-scheduling approach is illustrated by numerical simulations in 9 different multi-robot scenarios, achieving an 11.0% higher objective score and a 63.9% reduction in training time compared with a standard deep Q-learning algorithm.
KW - Meta-learning
KW - Multi-robot system
KW - Reinforcement learning
KW - Task scheduling
UR - http://www.scopus.com/inward/record.url?scp=105004663469&partnerID=8YFLogxK
U2 - 10.1016/j.cor.2025.107109
DO - 10.1016/j.cor.2025.107109
M3 - Article
AN - SCOPUS:105004663469
SN - 0305-0548
VL - 182
JO - Computers and Operations Research
JF - Computers and Operations Research
M1 - 107109
ER -