A Deep Reinforcement Learning-Based Energy Management Framework with Lagrangian Relaxation for Plug-In Hybrid Electric Vehicle

Hailong Zhang; Jiankun Peng; Huachun Tan; Hanxuan Dong; Fan Ding

doi:10.1109/TTE.2020.3043239

Hailong Zhang, Jiankun Peng^*, Huachun Tan^*, Hanxuan Dong, Fan Ding

^*Corresponding author for this work

Southeast University, Nanjing

Research output: Contribution to journal › Article › peer-review

39 Citations (Scopus)

Abstract

Reinforcement learning (RL)-based energy management is a current research hot spot for hybrid electric vehicles. Recent advances in RL-based energy management focus on energy-saving performance but give less consideration to the constrained setting needed for training safety. This article proposes an RL framework named coach-actor-double-critic (CADC) for optimizing energy management, formulated as a constrained Markov decision process (CMDP). A bilevel onboard controller comprises a neural network (NN)-based strategy actor and a rule-based strategy coach for online energy management. Once the output of the actor exceeds the constrained range of feasible solutions, the coach takes charge of energy management to ensure safety. Using Lagrangian relaxation, the CMDP optimization is transformed into an unconstrained dual problem that minimizes energy consumption while minimizing coach participation. The parameters of the actor are updated by policy gradient through RL training with the Lagrangian value function. Two critics with the same structure synchronously estimate the value function to avoid overestimation bias. Several experiments with bus trajectory data demonstrate the optimality, self-learning ability, and adaptability of CADC. The results indicate that CADC outperforms existing RL-based strategies and reaches above 95% of the energy-saving rate of the offline global optimum.
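
The mechanisms named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no rules, ranges, or update equations, so every function name, threshold, and constant below is a hypothetical placeholder. In particular, the coach rule, the feasible range, the dual-ascent step for the multiplier, and the use of the minimum of the two critic estimates (as in clipped double Q-learning) are assumptions chosen only to show how a coach override, a Lagrangian-relaxed reward, and a double critic could fit together.

```python
from dataclasses import dataclass


@dataclass
class StepResult:
    action: float      # engine power command actually applied (kW, hypothetical unit)
    coach_used: bool   # True when the rule-based coach overrode the actor


def coach_rule(soc: float, demand_kw: float) -> float:
    """Hypothetical rule-based fallback: run the engine only when the battery is low."""
    return demand_kw if soc < 0.3 else 0.0


def bilevel_control(actor_action: float, soc: float, demand_kw: float,
                    low: float = 0.0, high: float = 60.0) -> StepResult:
    """Apply the actor's command unless it leaves the assumed feasible range [low, high]."""
    if low <= actor_action <= high:
        return StepResult(actor_action, False)
    return StepResult(coach_rule(soc, demand_kw), True)


def lagrangian_reward(energy_cost: float, coach_used: bool, lam: float) -> float:
    """Relaxed reward: negative energy cost minus lam times the coach-participation cost."""
    return -energy_cost - lam * float(coach_used)


def update_multiplier(lam: float, coach_rate: float, budget: float,
                      lr: float = 0.1) -> float:
    """Assumed dual-ascent step: raise lam when coach participation exceeds the budget."""
    return max(0.0, lam + lr * (coach_rate - budget))


def double_critic_target(reward: float, q1_next: float, q2_next: float,
                         gamma: float = 0.99) -> float:
    """Bootstrapped target using the smaller critic estimate (assumed combination rule)."""
    return reward + gamma * min(q1_next, q2_next)
```

In this sketch, a larger multiplier `lam` makes each coach intervention more costly to the actor, so a policy trained on `lagrangian_reward` is pushed toward staying inside the feasible range on its own while still reducing energy cost.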

Original language	English
Article number	9286514
Pages (from-to)	1146-1160
Number of pages	15
Journal	IEEE Transactions on Transportation Electrification
Volume	7
Issue number	3
DOIs	https://doi.org/10.1109/TTE.2020.3043239
Publication status	Published - Sept 2021
Externally published	Yes

Keywords

Energy management
Lagrangian relaxation
plug-in hybrid electric vehicle (PHEV)
reinforcement learning (RL)
training safety

Cite this

@article{8ef7f657418940f8af673889374cf566,

title = "A Deep Reinforcement Learning-Based Energy Management Framework with Lagrangian Relaxation for Plug-In Hybrid Electric Vehicle",

abstract = "Reinforcement learning (RL)-based energy management is one of the current hot spots of hybrid electric vehicles. Recent advances in RL-based energy management focus on energy-saving performance but less considers the constrained setting for training safety. This article proposes an RL framework named coach-actor-double-critic (CADC) for the optimization of energy management considered as the constrained Markov decision process (CMDP). A bilevel onboard controller includes a neural network (NN)-based strategy actor and rule-based strategy coach for online energy management. Once the output of the actor exceeds the constrained range of feasible solutions, the coach would take charge of energy management to ensure safety. By using the Lagrangian relaxation, the optimization for CMDP transforms into an unconstrained dual problem to minimize the energy consumption while minimizing the coach participation. The parameters of the actor are updated in a manner of policy gradient through RL training with the Lagrangian value function. Double-critic with the same structure synchronously estimates the value function to avoid overestimate bias. Several experiments with the bus trajectories data demonstrate the optimality, self-learning ability, and adaptability of CADC. The results indicate that CADC outperforms the existing RL-based strategies and reaches above 95% energy-saving rate of the off-line global optimum.",

keywords = "Energy management, Lagrangian relaxation, plug-in hybrid electric vehicle (PHEV), reinforcement learning (RL), training safety",

author = "Hailong Zhang and Jiankun Peng and Huachun Tan and Hanxuan Dong and Fan Ding",

note = "Publisher Copyright: {\textcopyright} 2015 IEEE.",

year = "2021",

month = sep,

doi = "10.1109/TTE.2020.3043239",

language = "English",

volume = "7",

pages = "1146--1160",

journal = "IEEE Transactions on Transportation Electrification",

issn = "2332-7782",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

number = "3",

}

TY - JOUR

T1 - A Deep Reinforcement Learning-Based Energy Management Framework with Lagrangian Relaxation for Plug-In Hybrid Electric Vehicle

AU - Zhang, Hailong

AU - Peng, Jiankun

AU - Tan, Huachun

AU - Dong, Hanxuan

AU - Ding, Fan

PY - 2021/9

Y1 - 2021/9

AB - Reinforcement learning (RL)-based energy management is one of the current hot spots of hybrid electric vehicles. Recent advances in RL-based energy management focus on energy-saving performance but less considers the constrained setting for training safety. This article proposes an RL framework named coach-actor-double-critic (CADC) for the optimization of energy management considered as the constrained Markov decision process (CMDP). A bilevel onboard controller includes a neural network (NN)-based strategy actor and rule-based strategy coach for online energy management. Once the output of the actor exceeds the constrained range of feasible solutions, the coach would take charge of energy management to ensure safety. By using the Lagrangian relaxation, the optimization for CMDP transforms into an unconstrained dual problem to minimize the energy consumption while minimizing the coach participation. The parameters of the actor are updated in a manner of policy gradient through RL training with the Lagrangian value function. Double-critic with the same structure synchronously estimates the value function to avoid overestimate bias. Several experiments with the bus trajectories data demonstrate the optimality, self-learning ability, and adaptability of CADC. The results indicate that CADC outperforms the existing RL-based strategies and reaches above 95% energy-saving rate of the off-line global optimum.

KW - Energy management

KW - Lagrangian relaxation

KW - plug-in hybrid electric vehicle (PHEV)

KW - reinforcement learning (RL)

KW - training safety

UR - http://www.scopus.com/inward/record.url?scp=85097943096&partnerID=8YFLogxK

U2 - 10.1109/TTE.2020.3043239

DO - 10.1109/TTE.2020.3043239

M3 - Article

AN - SCOPUS:85097943096

SN - 2332-7782

VL - 7

SP - 1146

EP - 1160

JO - IEEE Transactions on Transportation Electrification

JF - IEEE Transactions on Transportation Electrification

IS - 3

M1 - 9286514

ER -
