Online Markov Chain-based energy management for a hybrid tracked vehicle with speedy Q-learning

Teng Liu; Bo Wang; Chenglang Yang

doi:10.1016/j.energy.2018.07.022

Teng Liu, Bo Wang^*, Chenglang Yang

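^*Corresponding author for this work

School of Mathematics

Research output: Contribution to journal › Article › Peer-reviewed

110 citations (Scopus)

Abstract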

This brief proposes a real-time energy management approach that lets a hybrid tracked vehicle adapt to different driving conditions. To characterize different route segments online, an onboard learning algorithm for Markov Chain models is employed to generate transition probability matrices of power demand. The induced matrix norm is presented as an initialization criterion to quantify differences between multiple transition probability matrices and to determine when to update them at a specific road segment. Since a series of control policies is available onboard the hybrid tracked vehicle, the induced matrix norm is also employed to choose the control policy that best matches the current driving condition. To accelerate convergence in the Markov Chain-based control policy computation, a reinforcement learning-enabled energy management strategy is derived using the speedy Q-learning algorithm. Simulations are carried out on two driving cycles, and the results indicate that the proposed energy management strategy greatly improves fuel economy compared with stochastic dynamic programming and conventional reinforcement learning (RL) approaches, and that it can be employed in real time.
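The two building blocks named in the abstract, estimating a transition probability matrix from observed power demand and comparing matrices with an induced norm, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the uniform fallback for unvisited states, and the choice of the induced 2-norm are all assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np


def estimate_tpm(power_states, n_states):
    """Estimate a transition probability matrix by counting transitions.

    `power_states` is a sequence of discretized power-demand indices
    (hypothetical input format, assumed for this sketch).
    """
    counts = np.zeros((n_states, n_states))
    for i, j in zip(power_states[:-1], power_states[1:]):
        counts[i, j] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    # Assumption: states that were never visited get a uniform row.
    safe_sums = np.where(row_sums > 0, row_sums, 1.0)
    uniform = np.full_like(counts, 1.0 / n_states)
    return np.where(row_sums > 0, counts / safe_sums, uniform)


def tpm_distance(p, q):
    """Induced 2-norm (largest singular value) of the difference p - q.

    Assumption: the abstract says "induced matrix norm" without naming one.
    """
    return np.linalg.norm(p - q, ord=2)


def select_policy(current_tpm, stored_tpms):
    """Return the index of the stored matrix closest to the current one."""
    distances = [tpm_distance(current_tpm, t) for t in stored_tpms]
    return int(np.argmin(distances))
```

In this sketch, a controller would call `select_policy` with the matrix estimated on the current road segment and the matrices associated with its precomputed control policies. The same distance could be compared against a threshold to decide when a stored matrix needs updating.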

Original language	English
Pages (from-to)	544-555
Number of pages	12
Journal	Energy
Volume	160
DOI	https://doi.org/10.1016/j.energy.2018.07.022
Publication status	Published - 1 Oct 2018

Cite this

@article{7925f49b263f446195b3ecaf503b675f,

title = "Online Markov Chain-based energy management for a hybrid tracked vehicle with speedy Q-learning",

keywords = "Hybrid tracked vehicle, Induced matrix norm, Markov chain, Onboard learning algorithm, Reinforcement learning, Speedy Q-learning",

author = "Teng Liu and Bo Wang and Chenglang Yang",

note = "Publisher Copyright: {\textcopyright} 2018 Elsevier Ltd",

year = "2018",

month = oct,

day = "1",

doi = "10.1016/j.energy.2018.07.022",

language = "English",

volume = "160",

pages = "544--555",

journal = "Energy",

issn = "0360-5442",

publisher = "Elsevier Ltd.",

}

TY - JOUR

T1 - Online Markov Chain-based energy management for a hybrid tracked vehicle with speedy Q-learning

AU - Liu, Teng

AU - Wang, Bo

AU - Yang, Chenglang

PY - 2018/10/1

Y1 - 2018/10/1

KW - Hybrid tracked vehicle

KW - Induced matrix norm

KW - Markov chain

KW - Onboard learning algorithm

KW - Reinforcement learning

KW - Speedy Q-learning

UR - http://www.scopus.com/inward/record.url?scp=85053079756&partnerID=8YFLogxK

U2 - 10.1016/j.energy.2018.07.022

DO - 10.1016/j.energy.2018.07.022

M3 - Article

AN - SCOPUS:85053079756

SN - 0360-5442

VL - 160

SP - 544

EP - 555

JO - Energy

JF - Energy

ER -
