Mamba-Driven Strategy for Multi-Agent Games: Towards a Stable Cooperative Value Framework

  • Jinming Qi
  • Pengyuan Min
  • Zhaohan Feng*
  • Yuzhou Wei
  • Gang Wang
  • Jian Sun
  • *Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Multi-agent reinforcement learning (MARL) has shown great potential in addressing collaborative issues within multi-agent systems (MAS), particularly in domains such as autonomous driving, robotic collaboration, and network systems. This study introduces an innovative MARL algorithm, MambaCoV, which focuses on tackling the challenges of complex interactions between agents and adapting to dynamic environments. MambaCoV employs a constrained communication framework that facilitates the exchange of essential information among agents, allowing the coordination of actions and the collective optimization of overall performance. Through this mechanism, agents can fairly assess each other's contributions and allocate rewards accordingly, thereby reducing the variance in policy gradient estimation and enhancing the learning efficiency of multi-agent systems. In policy learning, we introduce a network architecture centered around the Mamba model, designed to effectively utilize historical information. Experimental evaluations in cooperative navigation and the StarCraft Multi-Agent Challenge (SMAC) demonstrate that MambaCoV exhibits faster adaptability and more stable performance improvements in the early stages of training compared to existing advanced MARL methods. This result confirms that effective communication and coordination between agents are key factors in improving performance. Moreover, through precise credit assignment, MambaCoV ensures that each agent's contributions are reasonably assessed, enhancing the efficiency and stability of team collaboration.

Original language: English
Title of host publication: Proceedings of the 44th Chinese Control Conference, CCC 2025
Editors: Jian Sun, Hongpeng Yin
Publisher: IEEE Computer Society
Pages: 6137-6142
Number of pages: 6
ISBN (Electronic): 9789887581611
DOIs
Publication status: Published - 2025
Externally published: Yes
Event: 44th Chinese Control Conference, CCC 2025 - Chongqing, China
Duration: 28 Jul 2025 to 30 Jul 2025

Publication series

Name: Chinese Control Conference, CCC
ISSN (Print): 1934-1768
ISSN (Electronic): 2161-2927

Conference

Conference: 44th Chinese Control Conference, CCC 2025
Country/Territory: China
City: Chongqing
Period: 28/07/25 to 30/07/25

Keywords

  • Communication mechanism
  • Credit assignment
  • Multi-agent reinforcement learning
