OB-HPPO: An Option and Intrinsic Curiosity Based Hierarchical Reinforcement Learning Approach for Real-Time Strategy Games

  • Ruilin Jiang
  • Yanlong Zhai*
  • Yan Zheng
  • You Li
  • Yanglin Liu

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

The multi-agent real-time strategy game problem is a classic problem in reinforcement learning, and solving it offers useful guidance for economic and military applications in the real world. In recent years, researchers worldwide have made breakthroughs on related problems, but most existing techniques target specific environments or require high-performance computing platforms. As a result, the time and resources needed to train models grow exponentially as the complexity and scope of a task increase. In this paper, we propose OB-HPPO, an option and intrinsic curiosity based hierarchical reinforcement learning framework, to address these challenges. Our approach hierarchically decomposes a huge action space into several self-explainable options, breaking each atomic action decision down into a series of action decisions. OB-HPPO also introduces an intrinsic curiosity module (ICM), built on the Proximal Policy Optimization (PPO) algorithm, to improve the efficiency of model training and exploration. Experimental results show that OB-HPPO takes less training time and accumulates more reward than non-hierarchical models. We also test OB-HPPO against several representative AI models in the μRTS environment, where its win rate improves significantly.

Original language: English
Title of host publication: Advanced Intelligent Computing Technology and Applications - 20th International Conference, ICIC 2024, Proceedings
Editors: De-Shuang Huang, Yijie Pan, Xiankun Zhang
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 443-454
Number of pages: 12
ISBN (Print): 9789819755806
DOIs
Publication status: Published - 2024
Event: 20th International Conference on Intelligent Computing, ICIC 2024 - Tianjin, China
Duration: 5 Aug 2024 to 8 Aug 2024

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 14863 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 20th International Conference on Intelligent Computing, ICIC 2024
Country/Territory: China
City: Tianjin
Period: 5/08/24 to 8/08/24

Keywords

  • Hierarchical reinforcement learning
  • Modular hierarchical command
  • Option
  • Proximal policy optimization
  • Real-time strategy game
