A Deep Reinforcement Learning-Based Energy Management Framework with Lagrangian Relaxation for Plug-In Hybrid Electric Vehicle

Hailong Zhang, Jiankun Peng*, Huachun Tan*, Hanxuan Dong, Fan Ding

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

39 Citations (Scopus)

Abstract

Reinforcement learning (RL)-based energy management is a current research hot spot for hybrid electric vehicles. Recent advances in RL-based energy management focus on energy-saving performance but pay less attention to the constrained settings required for training safety. This article proposes an RL framework named coach-actor-double-critic (CADC) that formulates energy management optimization as a constrained Markov decision process (CMDP). A bilevel onboard controller combines a neural network (NN)-based strategy actor with a rule-based strategy coach for online energy management. Whenever the actor's output falls outside the constrained range of feasible solutions, the coach takes charge of energy management to ensure safety. Using Lagrangian relaxation, the CMDP optimization is transformed into an unconstrained dual problem that minimizes energy consumption while also minimizing coach participation. The actor's parameters are updated via policy gradient through RL training with the Lagrangian value function. Two critics with the same structure synchronously estimate the value function to avoid overestimation bias. Several experiments with bus trajectory data demonstrate the optimality, self-learning ability, and adaptability of CADC. The results indicate that CADC outperforms existing RL-based strategies and achieves more than 95% of the energy-saving rate of the offline global optimum.
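The abstract describes a bilevel coach-actor controller whose constraint violations are penalized through a Lagrangian dual objective. The following is a minimal illustrative sketch of that idea only, not the authors' implementation: the class and function names (CoachActorController, lagrangian_reward, update_multiplier), the feasible-range check, and the dual-ascent step size are all assumptions introduced for illustration.

```python
import numpy as np

# Hypothetical sketch of the CADC coach-actor safety override and the
# Lagrangian relaxation of the CMDP objective. All names and update rules
# below are illustrative assumptions, not the paper's actual code.

class CoachActorController:
    """Bilevel online controller: an NN actor proposes the power-split action;
    a rule-based coach overrides it when the action leaves the feasible set."""

    def __init__(self, actor_nn, rule_based_coach, action_low, action_high):
        self.actor = actor_nn          # learned NN-based strategy
        self.coach = rule_based_coach  # rule-based safe fallback strategy
        self.low, self.high = action_low, action_high

    def act(self, state):
        a = self.actor(state)
        if np.any(a < self.low) or np.any(a > self.high):
            # Actor output outside the feasible range: coach takes charge.
            return self.coach(state), 1.0   # 1.0 flags a coach intervention
        return a, 0.0


def lagrangian_reward(energy_cost, coach_flag, lam):
    """Unconstrained dual objective per step:
    minimize energy_cost + lam * coach_flag (returned as a reward, hence negated)."""
    return -(energy_cost + lam * coach_flag)


def update_multiplier(lam, avg_coach_rate, budget, lr=1e-3):
    """Dual ascent on the multiplier: increase lam while the average
    coach-participation rate exceeds the allowed budget."""
    return max(0.0, lam + lr * (avg_coach_rate - budget))
```

In this sketch, the actor is trained with policy gradient on the Lagrangian reward, while the multiplier update gradually drives coach participation toward zero, mirroring the dual problem described in the abstract.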

Original language: English
Article number: 9286514
Pages (from-to): 1146-1160
Number of pages: 15
Journal: IEEE Transactions on Transportation Electrification
Volume: 7
Issue number: 3
DOI
Publication status: Published - Sep 2021
Externally published: Yes

