Abstract
Hierarchical Reinforcement Learning (HRL) was proposed primarily to address problems with sparse reward signals and long time horizons. Many existing HRL algorithms use neural networks to generate goals automatically, without taking into account that not all goals advance the task. Some methods do optimize goal generation, but they represent each goal as a specific state value. In this paper, we propose a novel HRL algorithm for automatic goal discovery that optimizes goal generation in the latent state space by screening goals and selecting policies. We compare our approach with state-of-the-art algorithms on Atari 2600 games, and the results show that it can speed up learning and improve performance.
Original language | English |
---|---|
Pages (from-to) | 18049-18060 |
Number of pages | 12 |
Journal | Applied Intelligence |
Volume | 52 |
Issue number | 15 |
Publication status | Published - Dec 2022 |
Keywords
- Automatic goal discovery
- Goal screening
- Hierarchical reinforcement learning
- Policy selection