Risk-Conscious Mutations in Jump-Start Reinforcement Learning for Autonomous Racing Policy

Xiaohui Hou, Minggang Gan*, Wei Wu, Shiyue Zhao, Yuan Ji, Jie Chen

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

This study addresses trajectory planning and motion control policies for autonomous racing, which requires pushing racing vehicles to the limits of their capability to achieve maximum speed and minimal lap times. We propose a planning and control framework that integrates risk-conscious mutations in jump-start reinforcement learning (RCM-JSRL) with nonlinear model predictive control (NMPC). The RCM-JSRL algorithm incorporates jump-start curriculum learning and a risk-conscious genetic algorithm into reinforcement learning, leveraging prior expert knowledge and a curiosity-driven exploration mechanism to improve training efficiency while avoiding excessively conservative policies in high-complexity, high-risk scenarios. NMPC generates locally optimal control commands that satisfy vehicle dynamics constraints while tracking the designated trajectory. After training on track maps of varying difficulty, the proposed controller executes a policy that outperforms the guide policy, providing evidence of its effectiveness and scalability. We believe this technology can be applied to everyday driving scenarios, improving efficiency under special conditions, ensuring stability in critical situations, and broadening the scope of autonomous driving applications.

Original language: English
Pages (from-to): 638-648
Number of pages: 11
Journal: IEEE Transactions on Cybernetics
Volume: 55
Issue number: 2
Publication status: Published - 2025

Keywords

  • Autonomous racing vehicle
  • expert knowledge
  • motion control
  • reinforcement learning
  • trajectory planning
