TY - JOUR
T1 - Cooperative Multiagent Transfer Learning With Coalition Pattern Decomposition
AU - Zhou, Tianze
AU - Zhang, Fubiao
AU - Shao, Kun
AU - Dai, Zipeng
AU - Li, Kai
AU - Huang, Wenhan
AU - Wang, Weixun
AU - Wang, Bin
AU - Li, Dong
AU - Liu, Wulong
AU - Hao, Jianye
N1 - Publisher Copyright:
© 2018 IEEE.
PY - 2024/6/1
Y1 - 2024/6/1
N2 - Knowledge transfer in cooperative multiagent reinforcement learning (MARL) has drawn increasing attention in recent years. Unlike policy generalization in single-agent tasks, multiagent transfer learning must consider coordination knowledge rather than only individual knowledge. However, most existing methods focus solely on transferring individual agent policies, which introduces coordination bias and ultimately degrades performance in cooperative MARL. In this article, we propose a level-adaptive MARL framework, LA-QTransformer, that realizes knowledge transfer at the coordination level by efficiently decomposing agent coordination into multilevel coalition patterns for different agents. Compatible with the centralized training with decentralized execution regime, LA-QTransformer uses a level-adaptive transformer to generate suitable coalition patterns and then performs credit assignment for each agent. In addition, to handle unexpected changes in the number of agents during the coordination transfer phase, we design a policy network, the population invariant agent with transformer (PIT), that adapts to dynamic observation and action spaces. We evaluate LA-QTransformer and PIT on the StarCraft II micromanagement benchmark against several state-of-the-art MARL baselines. The experimental results demonstrate the superiority of LA-QTransformer and PIT and verify the feasibility of coordination knowledge transfer.
AB - Knowledge transfer in cooperative multiagent reinforcement learning (MARL) has drawn increasing attention in recent years. Unlike policy generalization in single-agent tasks, multiagent transfer learning must consider coordination knowledge rather than only individual knowledge. However, most existing methods focus solely on transferring individual agent policies, which introduces coordination bias and ultimately degrades performance in cooperative MARL. In this article, we propose a level-adaptive MARL framework, LA-QTransformer, that realizes knowledge transfer at the coordination level by efficiently decomposing agent coordination into multilevel coalition patterns for different agents. Compatible with the centralized training with decentralized execution regime, LA-QTransformer uses a level-adaptive transformer to generate suitable coalition patterns and then performs credit assignment for each agent. In addition, to handle unexpected changes in the number of agents during the coordination transfer phase, we design a policy network, the population invariant agent with transformer (PIT), that adapts to dynamic observation and action spaces. We evaluate LA-QTransformer and PIT on the StarCraft II micromanagement benchmark against several state-of-the-art MARL baselines. The experimental results demonstrate the superiority of LA-QTransformer and PIT and verify the feasibility of coordination knowledge transfer.
KW - Credit assignment
KW - multiagent reinforcement learning (MARL)
KW - transformer
UR - http://www.scopus.com/inward/record.url?scp=85159846861&partnerID=8YFLogxK
U2 - 10.1109/TG.2023.3272386
DO - 10.1109/TG.2023.3272386
M3 - Article
AN - SCOPUS:85159846861
SN - 2475-1502
VL - 16
SP - 352
EP - 364
JO - IEEE Transactions on Games
JF - IEEE Transactions on Games
IS - 2
ER -