TY - GEN
T1 - HiMacMic
T2 - 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, KDD 2023
AU - Zhang, Hancheng
AU - Li, Guozheng
AU - Liu, Chi Harold
AU - Wang, Guoren
AU - Tang, Jian
N1 - Publisher Copyright:
© 2023 ACM.
PY - 2023/8/6
Y1 - 2023/8/6
N2 - Multi-agent deep reinforcement learning (MADRL) has been widely used in many scenarios such as robotics and game AI. However, existing methods mainly focus on the optimization of agents' micro policies without considering the macro strategy. As a result, they cannot perform well in complex or sparse reward scenarios like the StarCraft Multi-Agent Challenge (SMAC) and Google Research Football (GRF). To this end, we propose a hierarchical MADRL framework called "HiMacMic"with dynamic asynchronous macro strategy. Spatially, HiMacMic determines a critical position by using a positional heat map. Temporally, the macro strategy dynamically decides its deadline and updates it asynchronously among agents. We validate HiMacMic in four widely used benchmarks, namely: Overcooked, GRF, SMAC and SMAC-v2 with nine chosen scenarios. Results show that HiMacMic not only converges faster and achieves higher results than ten existing approaches, but also shows its adaptability to different environment settings.
AB - Multi-agent deep reinforcement learning (MADRL) has been widely used in many scenarios such as robotics and game AI. However, existing methods mainly focus on the optimization of agents' micro policies without considering the macro strategy. As a result, they cannot perform well in complex or sparse reward scenarios like the StarCraft Multi-Agent Challenge (SMAC) and Google Research Football (GRF). To this end, we propose a hierarchical MADRL framework called "HiMacMic"with dynamic asynchronous macro strategy. Spatially, HiMacMic determines a critical position by using a positional heat map. Temporally, the macro strategy dynamically decides its deadline and updates it asynchronously among agents. We validate HiMacMic in four widely used benchmarks, namely: Overcooked, GRF, SMAC and SMAC-v2 with nine chosen scenarios. Results show that HiMacMic not only converges faster and achieves higher results than ten existing approaches, but also shows its adaptability to different environment settings.
KW - macro strategy
KW - multi-agent deep reinforcement learning
UR - http://www.scopus.com/inward/record.url?scp=85171346810&partnerID=8YFLogxK
U2 - 10.1145/3580305.3599379
DO - 10.1145/3580305.3599379
M3 - Conference contribution
AN - SCOPUS:85171346810
T3 - Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
SP - 3239
EP - 3248
BT - KDD 2023 - Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining
PB - Association for Computing Machinery
Y2 - 6 August 2023 through 10 August 2023
ER -