Expert-demonstration-augmented reinforcement learning for lane-change-aware eco-driving traversing consecutive traffic lights

Chuntao Zhang; Wenhui Huang; Xingyu Zhou; Chen Lv; Chao Sun

doi:10.1016/j.energy.2023.129472

Corresponding author: Chao Sun

School of Mechanical Engineering

Research output: Contribution to journal › Article › peer-review

5 Citations (Scopus)

Abstract

Eco-driving methods incorporating lateral motion exhibit enhanced energy-saving prospects in multi-lane traffic contexts, yet the randomly distributed obstructing vehicles and sparse traffic lights pose challenges in assessing the long-term value of instantaneous actions, impeding further improvement in energy efficiency. In response to this issue, a deep reinforcement learning (DRL)-based eco-driving method is proposed and augmented with the expert demonstration mechanism. Specifically, a Markov decision process matching with the target eco-driving scenario is systematically constructed, with which, the formulated DRL algorithm, parametrized soft actor-critic (PSAC), is trained to realize the integrated optimization of speed planning and lane-changing maneuver. To promote the training performance of PSAC under sparse rewards concerning traffic lights, an expert eco-driving model and an adaptive sampling approach are incorporated to constitute the expert demonstration mechanism. Simulation results highlight the superior performance of the proposed DRL-based eco-driving method and its training mechanism. Compared with the performance of the PSAC with a pure exploration-based training mechanism, the expert demonstration mechanism promotes the training efficiency and cumulated rewards of PSAC by about 60 % and 21.89 % respectively in the training phase, while in the test phase, a further reduction of 4.23 % benchmarked on a rule-based method is achieved in fuel consumption.
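
The abstract does not say how the adaptive sampling approach is implemented. As a purely illustrative sketch, not the authors' method, one common way to combine expert demonstrations with an agent's own experience is to draw each training batch from two replay buffers, with an expert share that shrinks as training progresses. The class name `MixedReplay`, the linear decay schedule, and every parameter value below are assumptions made for illustration only.

```python
import random
from collections import deque


class MixedReplay:
    """Hypothetical two-buffer replay: expert demonstrations plus agent experience."""

    def __init__(self, capacity=10000, start_ratio=0.5, end_ratio=0.0,
                 decay_steps=1000, seed=0):
        self.expert = deque(maxlen=capacity)  # transitions from an expert model
        self.agent = deque(maxlen=capacity)   # transitions from the learning agent
        self.start_ratio = start_ratio        # assumed initial expert share of a batch
        self.end_ratio = end_ratio            # assumed final expert share of a batch
        self.decay_steps = decay_steps        # assumed length of the linear decay
        self.rng = random.Random(seed)

    def expert_ratio(self, step):
        """Linearly interpolate the expert share from start_ratio to end_ratio."""
        frac = min(max(step / self.decay_steps, 0.0), 1.0)
        return self.start_ratio + frac * (self.end_ratio - self.start_ratio)

    def sample(self, batch_size, step):
        """Draw a batch whose expert share depends on the training step."""
        n_expert = round(batch_size * self.expert_ratio(step))
        n_expert = min(n_expert, len(self.expert))
        n_agent = min(batch_size - n_expert, len(self.agent))
        batch = (self.rng.sample(list(self.expert), n_expert)
                 + self.rng.sample(list(self.agent), n_agent))
        self.rng.shuffle(batch)
        return batch


if __name__ == "__main__":
    buf = MixedReplay()
    for i in range(100):
        buf.expert.append(("expert", i))
        buf.agent.append(("agent", i))
    for step in (0, 500, 1000):
        batch = buf.sample(32, step)
        n = sum(1 for source, _ in batch if source == "expert")
        print(step, n)
```

Under these assumed settings, early batches lean on demonstrations, which can help when rewards tied to traffic lights are sparse, while later batches rely on the agent's own exploration.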

Original language	English
Article number	129472
Journal	Energy
Volume	286
DOIs	https://doi.org/10.1016/j.energy.2023.129472
Publication status	Published - 1 Jan 2024

Keywords

Eco-driving
Energy economy
Expert demonstration
Reinforcement learning

Cite this

Zhang, C., Huang, W., Zhou, X., Lv, C., & Sun, C. (2024). Expert-demonstration-augmented reinforcement learning for lane-change-aware eco-driving traversing consecutive traffic lights. Energy, 286, Article 129472. https://doi.org/10.1016/j.energy.2023.129472

@article{99f861f13ca64ab8a2b72fc6ec081e9a,
  title = "Expert-demonstration-augmented reinforcement learning for lane-change-aware eco-driving traversing consecutive traffic lights",
  abstract = "Eco-driving methods incorporating lateral motion exhibit enhanced energy-saving prospects in multi-lane traffic contexts, yet the randomly distributed obstructing vehicles and sparse traffic lights pose challenges in assessing the long-term value of instantaneous actions, impeding further improvement in energy efficiency. In response to this issue, a deep reinforcement learning (DRL)-based eco-driving method is proposed and augmented with the expert demonstration mechanism. Specifically, a Markov decision process matching with the target eco-driving scenario is systematically constructed, with which, the formulated DRL algorithm, parametrized soft actor-critic (PSAC), is trained to realize the integrated optimization of speed planning and lane-changing maneuver. To promote the training performance of PSAC under sparse rewards concerning traffic lights, an expert eco-driving model and an adaptive sampling approach are incorporated to constitute the expert demonstration mechanism. Simulation results highlight the superior performance of the proposed DRL-based eco-driving method and its training mechanism. Compared with the performance of the PSAC with a pure exploration-based training mechanism, the expert demonstration mechanism promotes the training efficiency and cumulated rewards of PSAC by about 60 % and 21.89 % respectively in the training phase, while in the test phase, a further reduction of 4.23 % benchmarked on a rule-based method is achieved in fuel consumption.",
  keywords = "Eco-driving, Energy economy, Expert demonstration, Reinforcement learning",
  author = "Chuntao Zhang and Wenhui Huang and Xingyu Zhou and Chen Lv and Chao Sun",
  note = "Publisher Copyright: {\textcopyright} 2023",
  year = "2024",
  month = jan,
  day = "1",
  doi = "10.1016/j.energy.2023.129472",
  language = "English",
  volume = "286",
  journal = "Energy",
  issn = "0360-5442",
  publisher = "Elsevier Ltd.",
}

TY  - JOUR
T1  - Expert-demonstration-augmented reinforcement learning for lane-change-aware eco-driving traversing consecutive traffic lights
AU  - Zhang, Chuntao
AU  - Huang, Wenhui
AU  - Zhou, Xingyu
AU  - Lv, Chen
AU  - Sun, Chao
PY  - 2024/1/1
Y1  - 2024/1/1
AB  - Eco-driving methods incorporating lateral motion exhibit enhanced energy-saving prospects in multi-lane traffic contexts, yet the randomly distributed obstructing vehicles and sparse traffic lights pose challenges in assessing the long-term value of instantaneous actions, impeding further improvement in energy efficiency. In response to this issue, a deep reinforcement learning (DRL)-based eco-driving method is proposed and augmented with the expert demonstration mechanism. Specifically, a Markov decision process matching with the target eco-driving scenario is systematically constructed, with which, the formulated DRL algorithm, parametrized soft actor-critic (PSAC), is trained to realize the integrated optimization of speed planning and lane-changing maneuver. To promote the training performance of PSAC under sparse rewards concerning traffic lights, an expert eco-driving model and an adaptive sampling approach are incorporated to constitute the expert demonstration mechanism. Simulation results highlight the superior performance of the proposed DRL-based eco-driving method and its training mechanism. Compared with the performance of the PSAC with a pure exploration-based training mechanism, the expert demonstration mechanism promotes the training efficiency and cumulated rewards of PSAC by about 60 % and 21.89 % respectively in the training phase, while in the test phase, a further reduction of 4.23 % benchmarked on a rule-based method is achieved in fuel consumption.
KW  - Eco-driving
KW  - Energy economy
KW  - Expert demonstration
KW  - Reinforcement learning
UR  - http://www.scopus.com/inward/record.url?scp=85176233166&partnerID=8YFLogxK
U2  - 10.1016/j.energy.2023.129472
DO  - 10.1016/j.energy.2023.129472
M3  - Article
AN  - SCOPUS:85176233166
SN  - 0360-5442
VL  - 286
JO  - Energy
JF  - Energy
M1  - 129472
ER  -
