Abstract
Reinforcement learning (RL) is a promising approach for end-to-end autonomous driving, but its practical deployment remains challenging due to low sample efficiency and sensitivity to reward design. To address these challenges, this study presents a novel Q-advantage integrated human-guided reinforcement learning (QIHG-RL) framework that effectively combines the strengths of machine learning and human expertise. The QIHG-RL framework features: 1) an ensemble Q-advantage function that aggregates multiple value networks to enhance value estimation, and 2) an integration mechanism that embeds the Q-advantage into both the actor-critic network and the prioritized experience replay. This design allows the agent to leverage sparse and sub-optimal human demonstrations, accelerating policy learning in the early training phase while gradually enhancing exploration as training progresses. The framework is evaluated across three safety-critical driving tasks. Experimental results show a 167% improvement in sample efficiency compared to standard RL methods and a 14% performance gain over a state-of-the-art human-guided RL baseline. Furthermore, a Sim2Real pipeline combining domain randomization and semantic denoised remapping facilitates successful deployment on a real-world autonomous vehicle.
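The abstract describes aggregating an ensemble of Q-networks into a Q-advantage signal that is fed both to the actor-critic update and to prioritized experience replay. The paper's exact formulation is not given here, so the following is only a minimal sketch under assumed choices: linear Q-functions, mean aggregation over the ensemble, and a clipped advantage added to the usual TD-error replay priority. All names (`ensemble_q`, `q_advantage`, `per_priority`) and the weighting `beta` are illustrative, not the authors' API.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical ensemble of K linear Q-functions over D-dim state-action
# features; the paper does not specify the aggregation, so mean is assumed.
K, D = 5, 8
weights = rng.normal(size=(K, D))  # one weight vector per ensemble member

def ensemble_q(feat):
    """Return the K per-member Q estimates for one state-action feature."""
    return weights @ feat  # shape (K,)

def q_advantage(feat_demo, feat_policy):
    """Estimated advantage of a human demonstration action over the
    current policy's action, from aggregated (mean) ensemble Q-values."""
    return ensemble_q(feat_demo).mean() - ensemble_q(feat_policy).mean()

def per_priority(td_error, adv, eps=1e-3, beta=0.5):
    """Replay priority: TD error plus a clipped Q-advantage bonus, so
    transitions where the demonstration beats the policy are replayed
    more often (assumed integration, not the paper's exact rule)."""
    return abs(td_error) + beta * max(adv, 0.0) + eps
```

Under this sketch, a sub-optimal demonstration with negative advantage contributes no bonus, which is one plausible way sparse, imperfect demonstrations could guide early learning without dominating later exploration.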
| Original language | English |
|---|---|
| Journal | IEEE Transactions on Intelligent Transportation Systems |
| DOIs | |
| Publication status | Accepted/In press - 2026 |
| Externally published | Yes |
| Title | Q-Advantage Integrated Human-Guided Reinforcement Learning for Safe End-to-End Autonomous Driving |