Abstract
In contemporary Multi-Agent Reinforcement Learning (MARL), effectively enhancing the expressive capacity of value functions has been a persistent research focus. Many studies have employed value decomposition methods; however, because these methods neglect inter-agent collaboration, they fall short of optimal performance. Subsequent research introduced coordination graphs into value decomposition; nevertheless, these approaches often rely on simplistic rules to evaluate inter-agent collaboration and fail to adequately describe the collaborative relationships among agents in complex environments. To address this problem, we propose Influence Enhanced Sparse Coordination Graphs (IESCG). In this study, influence networks provide a quantitative description of the importance of collaboration among agents and serve as a crucial basis for constructing the topology of sparse, time-varying coordination graphs. Additionally, we propose Recurrent Payoff Function Networks (RPFN) to incorporate temporal information while providing the necessary input to the influence networks. Furthermore, Sparse Graph Advantage Selection Coefficients (SGASC) are introduced to stabilize the overall value function across different time steps, ensuring training stability. Experiments on the StarCraft II micromanagement and MACO benchmarks indicate that our algorithm not only converges faster and achieves higher win rates but also exhibits more pronounced advantages in complex scenarios.
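The abstract does not spell out how influence scores are turned into a graph topology, so the following is only a minimal sketch of the general idea under assumed interfaces: a per-time-step influence matrix scores every agent pair, the highest-scoring pairs become the edges of a sparse coordination graph, and the joint value is the sum of individual utilities plus pairwise payoffs on the selected edges. The function names (`build_sparse_graph`, `joint_q`) and the top-k edge-selection rule are hypothetical illustrations, not the paper's exact method.

```python
import torch

def build_sparse_graph(influence, k):
    """Keep the k highest-influence agent pairs as graph edges (assumed rule).

    influence: (n, n) symmetric tensor of pairwise influence scores.
    Returns a list of (i, j) index pairs with i < j.
    """
    n = influence.shape[0]
    iu, ju = torch.triu_indices(n, n, offset=1)   # all unordered agent pairs
    scores = influence[iu, ju]
    top = torch.topk(scores, k=min(k, scores.numel())).indices
    return list(zip(iu[top].tolist(), ju[top].tolist()))

def joint_q(utilities, payoffs, edges):
    """Sum per-agent utilities and pairwise payoffs over the sparse graph.

    utilities: (n,) per-agent utility values for the chosen actions.
    payoffs: dict mapping edge (i, j) -> payoff for the chosen joint action.
    """
    q = utilities.sum()
    for (i, j) in edges:
        q = q + payoffs[(i, j)]
    return q

# Toy usage: 4 agents, keep the 3 strongest edges at this time step.
n_agents, n_edges = 4, 3
influence = torch.rand(n_agents, n_agents)
influence = (influence + influence.T) / 2          # symmetrize pairwise scores
edges = build_sparse_graph(influence, n_edges)
utilities = torch.rand(n_agents)
payoffs = {e: torch.rand(()) for e in edges}
print(edges, joint_q(utilities, payoffs, edges))
```

In the paper's setting, the influence scores would come from the proposed influence networks and the pairwise payoffs from the RPFN, with the edge set recomputed at each time step so that the graph topology varies over time.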
Original language | English |
---|---|
Article number | 107454 |
Journal | Neural Networks |
Volume | 188 |
DOIs | |
Publication status | Published - Aug 2025 |
Externally published | Yes |
Keywords
- Coordination graphs
- Decentralized partially observable Markov decision process (Dec-POMDP)
- Multi-agent reinforcement learning
- Q-learning