TY - GEN
T1 - Multi-agent Reinforcement Learning for Sparse Reward Tasks Using Incremental Goal Enhanced Method
AU - Han, Minglei
AU - Guo, Zhentao
AU - Sun, Licheng
AU - Ding, Ao
AU - Wang, Tianhao
AU - Zhao, Guiyu
AU - Ma, Hongbin
N1 - Publisher Copyright:
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2025.
PY - 2025
Y1 - 2025
N2 - As the application of artificial intelligence continues to expand, complex decision-making problems such as multi-player gaming, multi-robot planning, and multi-vehicle control have become new challenges for machine intelligence. MARL, which concentrates on learning the optimal strategies of multiple agents coexisting in a shared environment, is an effective approach to multi-agent decision-making challenges. Among MARL algorithms, MAPPO has won the favor of the machine learning community due to its strong performance. However, the original MAPPO algorithm suffers from the sparse reward problem. To overcome sparse rewards and achieve sufficient learning in complex tasks, this paper proposes IGE-MAPPO, which uses an IGM that generates a variable-density, bi-domain reward signal, and conducts experiments on SMAC. The results show that the IGE-MAPPO algorithm can adapt to a variety of complex environments and improves performance compared with other typical MARL algorithms.
AB - As the application of artificial intelligence continues to expand, complex decision-making problems such as multi-player gaming, multi-robot planning, and multi-vehicle control have become new challenges for machine intelligence. MARL, which concentrates on learning the optimal strategies of multiple agents coexisting in a shared environment, is an effective approach to multi-agent decision-making challenges. Among MARL algorithms, MAPPO has won the favor of the machine learning community due to its strong performance. However, the original MAPPO algorithm suffers from the sparse reward problem. To overcome sparse rewards and achieve sufficient learning in complex tasks, this paper proposes IGE-MAPPO, which uses an IGM that generates a variable-density, bi-domain reward signal, and conducts experiments on SMAC. The results show that the IGE-MAPPO algorithm can adapt to a variety of complex environments and improves performance compared with other typical MARL algorithms.
KW - Incremental Goals
KW - Multi-Agent Proximal Policy Optimization
KW - Sparse Rewards
UR - http://www.scopus.com/inward/record.url?scp=105003904660&partnerID=8YFLogxK
U2 - 10.1007/978-981-96-4756-9_7
DO - 10.1007/978-981-96-4756-9_7
M3 - Conference contribution
AN - SCOPUS:105003904660
SN - 9789819647552
T3 - Communications in Computer and Information Science
SP - 74
EP - 85
BT - Computational Intelligence and Industrial Applications - 11th International Symposium, ISCIIA 2024, Proceedings
A2 - Xin, Bin
A2 - Ma, Hongbin
A2 - She, Jinhua
A2 - Cao, Weihua
PB - Springer Science and Business Media Deutschland GmbH
T2 - 11th International Symposium on Computational Intelligence and Industrial Applications, ISCIIA 2024
Y2 - 1 November 2024 through 5 November 2024
ER -