TY - JOUR
T1 - Curiosity-driven reinforcement learning with graph transformers for decision-making in connected and autonomous vehicles
AU - Liu, Qi
AU - Tang, Yujie
AU - Li, Xueyuan
AU - Wang, Kaifeng
AU - Yang, Fan
AU - Li, Zirui
N1 - Publisher Copyright: © 2025 Elsevier Ltd
PY - 2025/8
Y1 - 2025/8
N2 - Cooperative decision-making technology for connected and autonomous vehicles (CAVs) in mixed autonomy traffic is critical for the advancement of modern intelligent transportation systems. Recently, graph reinforcement learning (GRL) approaches have shown remarkable success in addressing decision-making challenges by leveraging graph-based technologies. However, existing GRL-based research faces substantial challenges in generating accurate feature embeddings to enhance driving policies, thoroughly exploring the driving environment, and efficiently training models. To address these challenges, this paper proposes a graph transformer reinforcement learning method with a distributional curiosity mechanism to improve the feature generation efficiency and environment exploration, ultimately boosting the decision-making performance of CAVs. First, an improved transformed graph convolutional network (ITransGCN) is proposed, integrating graph convolutional network (GCN), rotary position encoding method (ROPE), and temporal prior attention mechanism to strengthen sequential modeling capabilities, thereby generating informative spatial–temporal feature embeddings. Then, a curiosity mechanism based on distributional random network distillation (DRND) is proposed to enhance the exploratory capabilities of CAVs in driving environments. Additionally, a temporal integrated deep reinforcement learning (TI-DRL) model is developed, incorporating an auxiliary loss that integrates spatial–temporal information to improve the model's ability to capture the spatial–temporal dependencies. Finally, a cooperation-aware reward function is constructed to further evaluate the performance of CAVs. Comprehensive experiments are conducted across three representative traffic scenarios to validate the proposed method. The results demonstrate that our proposed method outperforms the baselines in driving safety, efficiency, and model stability, highlighting the effectiveness of the core components and the generalization capability of the proposed method.
KW - Connected and autonomous vehicles
KW - Curiosity mechanism
KW - Decision-making
KW - Graph reinforcement learning
KW - Transformer
UR - http://www.scopus.com/inward/record.url?scp=105007087994&partnerID=8YFLogxK
U2 - 10.1016/j.trc.2025.105183
DO - 10.1016/j.trc.2025.105183
M3 - Article
AN - SCOPUS:105007087994
SN - 0968-090X
VL - 177
JO - Transportation Research Part C: Emerging Technologies
JF - Transportation Research Part C: Emerging Technologies
M1 - 105183
ER -