OMA-QMIX: Exploring Optimal Multi-Agent Reinforcement Learning Framework in Multi-Action Spaces

Licheng Sun*, Hui Chen, Zhentao Guo, Tianhao Wang, Ao Ding, Hongbin Ma

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Many real-world tasks involve multiple agents, such as swarm robotics, drone swarm control, and autonomous vehicle coordination, and can be modeled as Multi-Agent Reinforcement Learning (MARL) tasks. Several methods, such as QMIX, have been proposed to address the credit assignment problem and learn cooperative strategies in MARL. The latest variant of QMIX, a state-of-the-art MARL algorithm, aims to relax QMIX's monotonicity constraint to improve performance on the StarCraft Multi-Agent Challenge (SMAC). However, these methods still explore insufficiently: agents struggle to identify states worth exploring, which makes it hard to coordinate exploration on those states and leads to suboptimal policies. In this paper, we propose an exploration-oriented MARL framework called OMA-QMIX, in which each agent is assigned an independent entropy temperature during exploration and selects targets from multiple projected state spaces to explore action spaces and approximate total state values. Additionally, we use Transformers to capture relationships and information from other agents for exploitation, fostering coordination among the agents. Experimental results demonstrate that OMA-QMIX significantly outperforms state-of-the-art algorithms on SMAC, achieving a success rate of 100% on almost all Hard and Super Hard maps.
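For readers unfamiliar with the monotonicity constraint mentioned in the abstract, the following is a minimal sketch of the idea, not the OMA-QMIX architecture. It assumes a single linear mixing layer with fixed weights, whereas QMIX itself generates its mixing weights from the global state with hypernetworks. The function name `monotonic_mix` and all numeric values are hypothetical and chosen only for illustration.

```python
def monotonic_mix(agent_qs, weights, bias):
    """Combine per-agent Q-values into a joint value Q_tot.

    Assumption: one linear mixing layer. Taking abs() of each weight
    keeps all mixing weights non-negative, so Q_tot cannot decrease
    when any single agent's Q-value increases (monotonicity).
    """
    return sum(abs(w) * q for w, q in zip(weights, agent_qs)) + bias


# Hypothetical values for three agents (illustration only).
weights = [0.5, -1.5, 2.0]  # signs are discarded by abs()
bias = 0.1

q_low = monotonic_mix([1.0, 2.0, 3.0], weights, bias)
q_high = monotonic_mix([1.0, 2.5, 3.0], weights, bias)  # agent 2 improves

# Raising one agent's Q-value never lowers the joint value.
print(q_high >= q_low)
```

Because each agent's greedy action also maximizes the joint value under this constraint, agents can act on their own Q-values at execution time. The cost is a restricted class of joint value functions, which is the limitation the relaxed variants mentioned in the abstract aim to address.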

Original language: English
Title of host publication: Proceedings of the 43rd Chinese Control Conference, CCC 2024
Editors: Jing Na, Jian Sun
Publisher: IEEE Computer Society
Pages: 8194-8199
Number of pages: 6
ISBN (Electronic): 9789887581581
DOIs
Publication status: Published - 2024
Event: 43rd Chinese Control Conference, CCC 2024 - Kunming, China
Duration: 28 Jul 2024 → 31 Jul 2024

Publication series

Name: Chinese Control Conference, CCC
ISSN (Print): 1934-1768
ISSN (Electronic): 2161-2927

Conference

Conference: 43rd Chinese Control Conference, CCC 2024
Country/Territory: China
City: Kunming
Period: 28/07/24 → 31/07/24

Keywords

  • Deep Learning
  • Multi-Agent
  • Reinforcement Learning
