Abstract
Deep Reinforcement Learning (DRL) has achieved significant advances in the transportation domain, improving traffic network efficiency, reducing pollutant emissions, and enhancing driving safety. Hierarchical Reinforcement Learning (HRL), a prominent approach within DRL, simplifies complex tasks by grouping states and decomposing the Markov Decision Process (MDP), which facilitates exploration in multi-dimensional state spaces. These two ideas, state abstraction and temporal abstraction, are particularly beneficial in complex, high-risk transportation scenarios such as on-ramp merging. This paper introduces a novel Multi-Option HRL (MO-HRL) framework with state segmentation. Unlike traditional option-based HRL, the proposed framework allows multiple options to be active simultaneously, with each option observing different states. After defining and justifying the framework, we apply MO-HRL to a simplified on-ramp merging scenario and incorporate curriculum learning to improve training. Extensive experiments examine different training modes and the “shared critic” problem, and compare MO-HRL with state-of-the-art baselines. Additionally, a six-lane mainline on-ramp merging scenario based on the NGSIM I-80 dataset is constructed. Simulation results from both scenarios show that the proposed approach outperforms existing methods and maintains a balance between mainline and on-ramp traffic.
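The idea of several options being active at once, each observing its own portion of the state, can be illustrated with a minimal sketch. This is not the paper's implementation: the `Option` class, the `step` function, the state layout, the thresholds, and the hand-written policies are all hypothetical stand-ins for what would be learned components in an actual MO-HRL agent.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence

State = Sequence[float]


@dataclass
class Option:
    """Hypothetical option: observes one segment of the full state."""
    name: str
    segment: slice                      # part of the full state this option sees
    can_start: Callable[[State], bool]  # initiation condition on that segment
    policy: Callable[[State], float]    # intra-option policy on that segment


def step(options: List[Option], full_state: State) -> Dict[str, float]:
    """Return one action per option that is active in the given state."""
    actions: Dict[str, float] = {}
    for opt in options:
        observed = full_state[opt.segment]  # state segmentation
        if opt.can_start(observed):         # several options may be active at once
            actions[opt.name] = opt.policy(observed)
    return actions


# Assumed state layout (illustrative only):
# [ego_speed, gap_ahead, distance_to_merge_point, target_lane_gap]
options = [
    # Always active; accelerates (+1.0) if the gap ahead is large, else brakes (-1.0).
    Option("car_following", slice(0, 2),
           lambda s: True,
           lambda s: 1.0 if s[1] > 20.0 else -1.0),
    # Active only near the merge point; merges (1.0) if the target-lane gap is large.
    Option("merge", slice(2, 4),
           lambda s: s[0] < 50.0,
           lambda s: 1.0 if s[1] > 15.0 else 0.0),
]

near_merge = step(options, [25.0, 30.0, 40.0, 18.0])
far_from_merge = step(options, [25.0, 10.0, 80.0, 18.0])
```

In the first call both options are active and each returns an action computed from its own state segment; in the second, only the car-following option fires because the hypothetical merge option's initiation condition is not met. A single-option HRL agent would instead select exactly one option per decision step.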
| Original language | English |
|---|---|
| Pages (from-to) | 22246-22261 |
| Number of pages | 16 |
| Journal | IEEE Transactions on Intelligent Transportation Systems |
| Volume | 26 |
| Issue number | 12 |
| Publication status | Published - 2025 |
Keywords
- On-ramp merging
- Curriculum learning
- Hierarchical reinforcement learning
- Option framework
Title: Multi-Option Hierarchical Reinforcement Learning Framework With State Segmentation for Mixed On-Ramp Merging