Abstract
Reinforcement learning (RL) is a promising approach for end-to-end autonomous driving, but its practical deployment remains challenging due to low sample efficiency and sensitivity to reward design. To address these challenges, this study presents a novel Q-advantage integrated human-guided reinforcement learning (QIHG-RL) framework that effectively combines the strengths of machine learning and human expertise. The QIHG-RL framework features: 1) an ensemble Q-advantage function that aggregates multiple value networks to enhance value estimation, and 2) an integration mechanism that embeds the Q-advantage into both the actor-critic network and the prioritized experience replay. This design allows the agent to leverage sparse and sub-optimal human demonstrations, accelerating policy learning in the early training phase while gradually enhancing exploration as training progresses. The framework is evaluated across three safety-critical driving tasks. Experimental results show a 167% improvement in sample efficiency compared to standard RL methods and a 14% performance gain over a state-of-the-art human-guided RL baseline. Furthermore, a Sim2Real pipeline combining domain randomization and semantic denoised remapping facilitates successful deployment on a real-world autonomous vehicle.
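The abstract describes an ensemble Q-advantage that aggregates multiple value networks into a single advantage estimate. The paper's exact formulation is not given here; below is a minimal, hypothetical sketch assuming the advantage of an action is the ensemble-mean Q-value minus an ensemble-mean baseline over alternative actions (the function name, signature, and toy Q-functions are all illustrative, not the authors' implementation).

```python
import numpy as np

def ensemble_q_advantage(q_ensemble, state, action, alt_actions):
    """Illustrative ensemble Q-advantage (hypothetical, not the paper's method).

    q_ensemble: list of callables q(state, action) -> float, one per value network.
    alt_actions: alternative actions used to form a baseline, standing in
    for a learned state-value V(s).
    """
    # Aggregate multiple value networks by averaging their estimates.
    q_sa = float(np.mean([q(state, action) for q in q_ensemble]))
    # Baseline: ensemble-mean Q over the alternative actions.
    baseline = float(np.mean([q(state, a) for q in q_ensemble for a in alt_actions]))
    # A positive value suggests the (e.g., demonstrated) action beats the baseline.
    return q_sa - baseline

# Toy usage with two deterministic "Q-networks" (for illustration only).
q_fns = [lambda s, a: a, lambda s, a: 2 * a]
adv = ensemble_q_advantage(q_fns, state=1.0, action=1.0, alt_actions=[0.0])
print(adv)  # 1.5
```

In a human-guided setting, such an advantage score could, for instance, gate whether a demonstrated action is imitated or prioritize demonstration transitions in the replay buffer; how QIHG-RL actually embeds it into the actor-critic network and prioritized experience replay is detailed in the paper itself.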
| Original language | English |
|---|---|
| Pages (from-to) | 2957-2969 |
| Number of pages | 13 |
| Journal | IEEE Transactions on Intelligent Transportation Systems |
| Volume | 27 |
| Issue number | 3 |
| DOIs | |
| Publication status | Published - 2026 |
| Externally published | Yes |
Keywords
- Human-guided reinforcement learning
- Q-advantage integration
- Sim2Real transfer
- autonomous driving
- safety-critical driving
Title: Q-Advantage Integrated Human-Guided Reinforcement Learning for Safe End-to-End Autonomous Driving