Q-Advantage Integrated Human-Guided Reinforcement Learning for Safe End-to-End Autonomous Driving

  • Yong Wang
  • Pei Wang
  • Hongwen He*
  • Jingda Wu
  • Yingjuan Tang
  • Zirui Kuang

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Reinforcement learning (RL) is a promising approach for end-to-end autonomous driving, but its practical deployment remains challenging due to low sample efficiency and sensitivity to reward design. To address these challenges, this study presents a novel Q-advantage integrated human-guided reinforcement learning (QIHG-RL) framework that combines the strengths of machine learning and human expertise. The framework has two key components: 1) an ensemble Q-advantage function that aggregates multiple value networks to improve value estimation, and 2) an integration mechanism that embeds the Q-advantage into both the actor-critic network and the prioritized experience replay. This design allows the agent to leverage sparse, sub-optimal human demonstrations, accelerating policy learning in the early training phase while gradually enhancing exploration as training progresses. The framework is evaluated on three safety-critical driving tasks. Experimental results show a 167% improvement in sample efficiency over standard RL methods and a 14% performance gain over a state-of-the-art human-guided RL baseline. Furthermore, a Sim2Real pipeline combining domain randomization and semantic denoised remapping enables successful deployment on a real-world autonomous vehicle.
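The abstract does not give implementation details, but a minimal sketch of the two mechanisms it names (an ensemble Q-advantage, and its use in the actor loss and replay priorities) might look as follows. All names here (EnsembleQ, q_advantage, actor_imitation_weight, per_priority), the mean aggregation over the ensemble, and the clipping and priority formulas are illustrative assumptions, not the authors' code:

```python
import torch
import torch.nn as nn


class EnsembleQ(nn.Module):
    """Ensemble of K critic networks (hypothetical architecture, not the paper's)."""

    def __init__(self, obs_dim: int, act_dim: int, k: int = 5, hidden: int = 256):
        super().__init__()
        self.nets = nn.ModuleList([
            nn.Sequential(
                nn.Linear(obs_dim + act_dim, hidden), nn.ReLU(),
                nn.Linear(hidden, hidden), nn.ReLU(),
                nn.Linear(hidden, 1),
            )
            for _ in range(k)
        ])

    def forward(self, obs: torch.Tensor, act: torch.Tensor) -> torch.Tensor:
        x = torch.cat([obs, act], dim=-1)
        # Stack per-network estimates: shape (K, batch, 1).
        return torch.stack([net(x) for net in self.nets], dim=0)


def q_advantage(ensemble: EnsembleQ, obs, demo_act, policy_act):
    """Q-advantage of a demonstrated action over the current policy action.

    Aggregating (here: averaging, an assumption) across the ensemble is one way
    to reduce noise in the value estimate, in the spirit of the abstract.
    """
    with torch.no_grad():
        q_demo = ensemble(obs, demo_act).mean(dim=0)
        q_pi = ensemble(obs, policy_act).mean(dim=0)
    return q_demo - q_pi  # positive => the demonstration looks better than the policy


def actor_imitation_weight(adv: torch.Tensor, beta: float = 1.0) -> torch.Tensor:
    """Integration point 1 (assumed form): weight an imitation term in the actor
    loss by the clipped Q-advantage, so sub-optimal demonstrations with negative
    advantage are down-weighted rather than blindly imitated."""
    return torch.clamp(beta * adv, min=0.0)


def per_priority(td_error: torch.Tensor, adv: torch.Tensor,
                 lam: float = 0.5, eps: float = 1e-3) -> torch.Tensor:
    """Integration point 2 (assumed form): augment the usual TD-error priority
    with the Q-advantage magnitude, so informative human transitions are
    replayed more often early in training."""
    return td_error.abs() + lam * adv.abs() + eps
```

In such a scheme, the imitation weight would scale a behavior-cloning term in the actor loss, and the augmented priority would replace the standard TD-error priority in the replay buffer; as training progresses and the policy's own Q-values catch up, both effects would naturally fade, consistent with the abstract's early-guidance, later-exploration behavior.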

Original language: English
Journal: IEEE Transactions on Intelligent Transportation Systems
Publication status: Accepted/In press - 2026
Externally published: Yes
