Cooperative Multiagent Transfer Learning With Coalition Pattern Decomposition

Tianze Zhou, Fubiao Zhang*, Kun Shao, Zipeng Dai, Kai Li, Wenhan Huang, Weixun Wang, Bin Wang, Dong Li, Wulong Liu, Jianye Hao

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

1 Citation (Scopus)

Abstract

Knowledge transfer in cooperative multiagent reinforcement learning (MARL) has drawn increasing attention in recent years. Unlike policy generalization in single-agent tasks, multiagent transfer learning depends more on coordination knowledge than on individual knowledge. However, most existing methods focus only on transferring individual agent policies, which introduces coordination bias and ultimately degrades performance in cooperative MARL. In this article, we propose a level-adaptive MARL framework called 'LA-QTransformer' that transfers knowledge at the coordination level by efficiently decomposing agent coordination into multilevel coalition patterns for different agents. Compatible with the centralized training with decentralized execution regime, LA-QTransformer uses a level-adaptive transformer to generate suitable coalition patterns and then performs credit assignment for each agent. In addition, to handle unexpected changes in the number of agents during the coordination transfer phase, we design a policy network called 'population invariant agent with transformer (PIT)' that adapts to dynamic observation and action spaces. We evaluate LA-QTransformer and PIT on the StarCraft II micromanagement benchmark against several state-of-the-art MARL baselines. The experimental results demonstrate the superiority of LA-QTransformer and PIT and verify the feasibility of coordination knowledge transfer.

Original language: English
Pages (from-to): 352-364
Number of pages: 13
Journal: IEEE Transactions on Games
Volume: 16
Issue number: 2
Publication status: Published - 1 Jun 2024

Keywords

  • Credit assignment
  • multiagent reinforcement learning (MARL)
  • transformer
