Screening goals and selecting policies in hierarchical reinforcement learning

Junyan Zhou, Jing Chen*, Yanfeng Tong, Junrui Zhang

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

1 Citation (Scopus)

Abstract

Hierarchical Reinforcement Learning (HRL) is primarily proposed to address problems with sparse reward signals and long time horizons. Many existing HRL algorithms use neural networks to generate goals automatically, without accounting for the fact that not all goals advance the task. Some methods do optimize goal generation, but they represent each goal as a specific state value. In this paper, we propose a novel HRL algorithm for automatic goal discovery that optimizes goal generation in the latent state space by screening goals and selecting policies. We compare our approach with state-of-the-art algorithms on Atari 2600 games, and the results show that it speeds up learning and improves performance.
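The two steps named in the abstract — screening candidate goals and selecting a low-level policy for a surviving goal — can be illustrated with a minimal sketch. This is not the paper's algorithm: the goal representation (2-D latent vectors), the `value_fn` threshold test, and the per-policy `score_fn` are all hypothetical stand-ins for the learned components the paper would use.

```python
# Hypothetical sketch of goal screening and policy selection in HRL.
# Goals are toy 2-D latent vectors; value_fn and score_fn stand in for
# learned estimators and are NOT taken from the paper.

def screen_goals(candidates, value_fn, threshold):
    """Discard candidate goals whose estimated value is below a threshold,
    i.e. goals judged unlikely to advance the task."""
    return [g for g in candidates if value_fn(g) >= threshold]

def select_policy(goal, policy_names, score_fn):
    """Pick the low-level policy with the highest fitness score for `goal`."""
    return max(policy_names, key=lambda name: score_fn(name, goal))

if __name__ == "__main__":
    # Candidate goals produced by a (hypothetical) high-level goal generator.
    candidates = [(0.9, 0.8), (0.1, 0.2), (0.6, 0.7)]
    value_fn = lambda g: g[0] + g[1]          # toy value estimate
    survivors = screen_goals(candidates, value_fn, threshold=1.0)

    # Toy per-policy skill parameter: how close a policy's "specialty"
    # is to the goal's first latent coordinate.
    skill = {"reach": 0.2, "explore": 0.8}
    score_fn = lambda name, g: -abs(skill[name] - g[0])
    chosen = select_policy(survivors[0], list(skill), score_fn)
    print(survivors, chosen)
```

Here the low-value goal `(0.1, 0.2)` is screened out, and for the surviving goal `(0.9, 0.8)` the policy whose toy skill parameter lies closest to the goal is selected.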

Original language: English
Pages (from-to): 18049-18060
Number of pages: 12
Journal: Applied Intelligence
Volume: 52
Issue number: 15
DOIs
Publication status: Published - Dec 2022

Keywords

  • Automatic goal discovery
  • Goal screening
  • Hierarchical reinforcement learning
  • Policy selection
