Tackling SOC long-term dynamic for energy management of hybrid electric buses via adaptive policy optimization

Hailong Zhang; Jiankun Peng; Huachun Tan; Hanxuan Dong; Fan Ding; Bin Ran

doi:10.1016/j.apenergy.2020.115031

Corresponding author: Jiankun Peng

Research output: Contribution to journal › Article › peer-review

23 Citations (Scopus)

Abstract

Plug-in hybrid electric buses (PHEBs) have the potential to satisfy both fuel-efficiency and driving-mileage requirements under complex urban traffic conditions. However, optimal charge and discharge management remains a pivotal challenge for energy management because of the inherent uncertainty in driving conditions. Common methods based on a reference state-of-charge (SOC) profile have limited adaptiveness, which restricts the economic performance of on-line energy management systems. Reinforcement learning based energy management strategies are promising because they exhibit significant self-learning ability. However, for PHEBs, the sparse rewards caused by long-term SOC shortage make these strategies prone to becoming trapped in local optima. This paper concentrates on incorporating battery power reduction, in the form of conditional entropy, into a reinforcement learning based energy management strategy. The proposed method, named adaptive policy optimization (APO), introduces a novel advantage function to evaluate energy-saving performance with respect to long-term SOC dynamics, and uses a Bayesian neural network based SOC shortage probability estimator to optimize the energy management strategy, which is parameterized by a deep neural network. Several experiments on a standard driving cycle demonstrate the optimality, self-learning ability and convergence of APO. Moreover, its adaptability and robustness are validated on real bus trajectory data. Across these experiments, the proposed model exhibits enhanced fuel economy and more suitable SOC planning than existing energy management strategies. The results indicate that APO outperforms the two compared online strategies by 9.8% and 2.6%, respectively, and reaches 98% of the energy-saving rate of the offline global optimum.

Original language	English
Article number	115031
Journal	Applied Energy
Volume	269
DOIs	https://doi.org/10.1016/j.apenergy.2020.115031
Publication status	Published - 1 Jul 2020
Externally published	Yes

Keywords

Energy management
Intelligent bus system
Plug-in hybrid electric vehicle
Reinforcement learning
Trajectory data mining

Cite this

@article{79d4f6841ddf4b52b190c3457624a678,

title = "Tackling SOC long-term dynamic for energy management of hybrid electric buses via adaptive policy optimization",

abstract = "Plug-in hybrid electric buses (PHEBs) have the potential to satisfy both the fuel efficiency and the driving-mileage under complex urban traffic conditions. However, the optimal charge and discharge management is still a pivotal challenge of energy management for the inherent uncertainty in driving conditions. The common reference state-of-charge (SOC) profile based methods are limited by the adaptiveness which restricts the economic performance of on-line energy management systems. Promisingly, reinforcement learning based energy management strategies exhibited the significant self-learning ability. However, for PHEBs, the sparse rewards by the long-term SOC shortage make the strategies easily trick into the local optimal solution. The work presented in this paper concentrates on combining battery power reduction in the form of conditional entropy into reinforcement learning based energy management strategy. The proposed method named adaptive policy optimization (APO) introduces a novel advantage function to evaluate energy-saving performance considering long-term SOC dynamic, and a Bayesian neural network based SOC shortage probability estimator is utilized to optimize the energy management strategy parameterized by a deep neural network. Several experiments in a standard driving cycle demonstrate the optimality, self-learning ability and convergence of the APO. Moreover, the adaptability and robust performance get validated over the real bus trajectories data. With the comprehensive experiments in this paper, the proposed model exhibits enhanced fuel economy and more suitable SOC planning in comparison with the existing energy management strategies. The results indicate that APO respectively outperforms the compared online strategies by 9.8% and 2.6% and reaches 98% energy-saving rate of the offline global optimum.",

keywords = "Energy management, Intelligent bus system, Plug-in hybrid electric vehicle, Reinforcement learning, Trajectory data mining",

author = "Hailong Zhang and Jiankun Peng and Huachun Tan and Hanxuan Dong and Fan Ding and Bin Ran",

note = "Publisher Copyright: {\textcopyright} 2020 Elsevier Ltd",

year = "2020",

month = jul,

day = "1",

doi = "10.1016/j.apenergy.2020.115031",

language = "English",

volume = "269",

journal = "Applied Energy",

issn = "0306-2619",

publisher = "Elsevier B.V.",

}

TY - JOUR

T1 - Tackling SOC long-term dynamic for energy management of hybrid electric buses via adaptive policy optimization

AU - Zhang, Hailong

AU - Peng, Jiankun

AU - Tan, Huachun

AU - Dong, Hanxuan

AU - Ding, Fan

AU - Ran, Bin

PY - 2020/7/1

Y1 - 2020/7/1

N2 - Plug-in hybrid electric buses (PHEBs) have the potential to satisfy both the fuel efficiency and the driving-mileage under complex urban traffic conditions. However, the optimal charge and discharge management is still a pivotal challenge of energy management for the inherent uncertainty in driving conditions. The common reference state-of-charge (SOC) profile based methods are limited by the adaptiveness which restricts the economic performance of on-line energy management systems. Promisingly, reinforcement learning based energy management strategies exhibited the significant self-learning ability. However, for PHEBs, the sparse rewards by the long-term SOC shortage make the strategies easily trick into the local optimal solution. The work presented in this paper concentrates on combining battery power reduction in the form of conditional entropy into reinforcement learning based energy management strategy. The proposed method named adaptive policy optimization (APO) introduces a novel advantage function to evaluate energy-saving performance considering long-term SOC dynamic, and a Bayesian neural network based SOC shortage probability estimator is utilized to optimize the energy management strategy parameterized by a deep neural network. Several experiments in a standard driving cycle demonstrate the optimality, self-learning ability and convergence of the APO. Moreover, the adaptability and robust performance get validated over the real bus trajectories data. With the comprehensive experiments in this paper, the proposed model exhibits enhanced fuel economy and more suitable SOC planning in comparison with the existing energy management strategies. The results indicate that APO respectively outperforms the compared online strategies by 9.8% and 2.6% and reaches 98% energy-saving rate of the offline global optimum.

KW - Energy management

KW - Intelligent bus system

KW - Plug-in hybrid electric vehicle

KW - Reinforcement learning

KW - Trajectory data mining

UR - http://www.scopus.com/inward/record.url?scp=85084056362&partnerID=8YFLogxK

U2 - 10.1016/j.apenergy.2020.115031

DO - 10.1016/j.apenergy.2020.115031

M3 - Article

AN - SCOPUS:85084056362

SN - 0306-2619

VL - 269

JO - Applied Energy

JF - Applied Energy

M1 - 115031

ER -
