Merging planning in dense traffic scenarios using interactive safe reinforcement learning

Xiaohui Hou, Minggang Gan*, Wei Wu, Chenyu Wang, Yuan Ji, Shiyue Zhao

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

1 Citation (Scopus)

Abstract

Autonomous navigation in dense traffic scenarios, such as on-ramp forced merging, still poses significant challenges for autonomous vehicles seeking to prevent accidents and alleviate traffic congestion. This paper introduces a novel motion planning framework that combines Interactive Safe Reinforcement Learning (IntSRL) with Nonlinear Model Predictive Control (NMPC). The framework develops an interactive merging planning policy that accounts for the uncertainty of traffic participants, multi-objective optimization, and heterogeneous vehicle interactions. The upper planner, IntSRL, supplies the lower planner, NMPC, with a global guidance path and velocity guidance. An Adaptive Safety Governor (ASG) module within IntSRL adjusts potentially unsafe actions by incorporating prior knowledge and driving experience. A coupling evaluation mechanism for multi-objective optimization is embedded into the reward shaping, integrating driving safety and strategy efficiency. We evaluate the proposed controller in various dense traffic scenarios using the proposed Heterogeneous Intelligent Driver Model (H-IDM), which accounts for the different driving styles and cooperative willingness of other vehicles. The test results indicate that the proposed method outperforms existing optimization-based and learning-based baselines on both qualitative and quantitative measures.
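
The abstract does not give the H-IDM equations. As a rough illustration of what "different driving styles" can mean in an IDM-based traffic simulation, the sketch below implements the standard Intelligent Driver Model acceleration law and assigns each surrounding vehicle its own parameter set. The `DriverStyle` class, the `AGGRESSIVE` and `CAUTIOUS` presets, and all numeric values are hypothetical and are not taken from the paper; the authors' H-IDM may differ, in particular in how it models cooperative willingness.

```python
import math
from dataclasses import dataclass


@dataclass
class DriverStyle:
    """Hypothetical per-vehicle IDM parameters (values are illustrative only)."""
    v0: float           # desired free-flow speed (m/s)
    T: float            # desired time headway (s)
    s0: float           # minimum standstill gap (m)
    a_max: float        # maximum acceleration (m/s^2)
    b: float            # comfortable deceleration (m/s^2)
    delta: float = 4.0  # acceleration exponent of the standard IDM


def idm_acceleration(style: DriverStyle, v: float, gap: float, dv: float) -> float:
    """Standard IDM acceleration.

    v   : speed of the following vehicle (m/s)
    gap : bumper-to-bumper distance to the leader (m), must be > 0
    dv  : approach rate, v - v_leader (m/s)
    """
    # Desired dynamic gap; the interaction term is clipped at zero so the
    # desired gap never drops below the standstill gap s0.
    interaction = v * style.T + v * dv / (2.0 * math.sqrt(style.a_max * style.b))
    s_star = style.s0 + max(0.0, interaction)
    return style.a_max * (1.0 - (v / style.v0) ** style.delta - (s_star / gap) ** 2)


# Two assumed driving styles: heterogeneity here is simply different parameters.
AGGRESSIVE = DriverStyle(v0=33.0, T=1.0, s0=1.5, a_max=2.0, b=3.0)
CAUTIOUS = DriverStyle(v0=28.0, T=2.0, s0=3.0, a_max=1.0, b=1.5)

if __name__ == "__main__":
    # Same traffic state for both drivers: 20 m/s, 30 m gap, leader at equal speed.
    for name, style in (("aggressive", AGGRESSIVE), ("cautious", CAUTIOUS)):
        print(name, round(idm_acceleration(style, v=20.0, gap=30.0, dv=0.0), 3))
```

In the same traffic state, the short-headway driver in this sketch keeps accelerating while the long-headway driver brakes to open the gap. That kind of variation in how surrounding vehicles respond is what a merging planner has to cope with in a heterogeneous traffic model.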

Original language: English
Article number: 111548
Journal: Knowledge-Based Systems
Volume: 290
Publication status: Published - 22 Apr 2024

Keywords

  • Autonomous driving
  • Dense traffic scenario
  • Motion planning
  • Reinforcement learning
  • Safety constraint
