Abstract
Hierarchical Reinforcement Learning (HRL) is primarily proposed to address problems with sparse reward signals and long time horizons. Many existing HRL algorithms use neural networks to generate goals automatically, but they do not take into account that not all goals advance the task. Some methods have addressed the optimization of goal generation, but they represent goals as specific state values. In this paper, we propose a novel HRL algorithm for automatically discovering goals, which optimizes goal generation in the latent state space by screening goals and selecting policies. We compare our approach with state-of-the-art algorithms on Atari 2600 games, and the results show that it speeds up learning and improves performance.
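The abstract's core idea, that candidate goals proposed in a latent state space should be screened because not all of them advance the task, can be illustrated with a minimal sketch. Everything below is a hypothetical illustration under assumed components (the `encode` projection, the `propose_goals` stub, and the toy `value_fn`), not the paper's actual algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(state: np.ndarray) -> np.ndarray:
    """Hypothetical encoder: project a raw state into a latent space.
    Stands in for a learned network; here a fixed averaging projection."""
    W = np.ones((4, state.shape[0])) / state.shape[0]
    return W @ state

def propose_goals(latent_state: np.ndarray, n: int = 8) -> np.ndarray:
    """High-level policy stub: sample candidate goals near the current
    latent state (assumed behavior, not the paper's generator)."""
    return latent_state + rng.normal(scale=0.5, size=(n, latent_state.shape[0]))

def value_fn(z: np.ndarray) -> float:
    """Toy stand-in for a learned value estimate over latent states:
    latent points near the all-ones vector are treated as 'good'."""
    return float(-np.linalg.norm(z - 1.0))

def screen_goals(goals, latent_state, value_fn, min_advantage=0.0):
    """Keep only candidate goals whose estimated value exceeds the current
    state's value -- a simple proxy for 'not all goals advance the task'."""
    baseline = value_fn(latent_state)
    return [g for g in goals if value_fn(g) - baseline > min_advantage]

state = rng.normal(size=16)
z = encode(state)
candidates = propose_goals(z)
useful = screen_goals(candidates, z, value_fn)
# A low-level policy would then be conditioned on one of the surviving goals.
print(f"{len(useful)} of {len(candidates)} candidate goals passed screening")
```

In this sketch the screening criterion is a value-advantage threshold; the paper's own criterion and policy-selection mechanism are not specified in the abstract.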
| Original language | English |
| ---|--- |
| Pages (from-to) | 18049-18060 |
| Number of pages | 12 |
| Journal | Applied Intelligence |
| Volume | 52 |
| Issue | 15 |
| DOI | |
| Publication status | Published - December 2022 |