TY - JOUR
T1 - A twin delayed deep deterministic policy gradient-based energy management strategy for a battery-ultracapacitor electric vehicle considering driving condition recognition with learning vector quantization neural network
AU - Liu, Rui
AU - Wang, Chun
AU - Tang, Aihua
AU - Zhang, Yongzhi
AU - Yu, Quanqing
N1 - Publisher Copyright: © 2023 Elsevier Ltd
PY - 2023/11/1
Y1 - 2023/11/1
N2 - Deep reinforcement learning algorithms have been widely applied in the energy management of hybrid energy storage systems. However, these algorithms, such as DQN and DDPG, suffer from discontinuous action spaces and consistently overestimated Q values. To address these issues, a novel energy management strategy based on the twin delayed deep deterministic policy gradient (TD3) algorithm is proposed for battery-ultracapacitor electric vehicles in this study. In addition, a driving condition recognition method is integrated into the energy management strategy framework to reduce the training time of the TD3 agent. The detailed implementation steps are as follows. First, dynamic experiments are performed to establish high-precision models of the battery and ultracapacitor. Second, learning vector quantization neural networks are applied to classify driving conditions into urban, suburban and highway conditions. Third, three parallel TD3 agents are trained for urban, suburban and highway conditions, respectively. Finally, the proposed strategy is evaluated under standard driving cycles. The simulation results indicate that, compared with the TD3-based strategy, the proposed strategy improves the economy by 1 % and reduces the training time by 34 %, and the economic gap with the dynamic programming-based energy management strategy is narrowed to 3 %.
AB - Deep reinforcement learning algorithms have been widely applied in the energy management of hybrid energy storage systems. However, these algorithms, such as DQN and DDPG, suffer from discontinuous action spaces and consistently overestimated Q values. To address these issues, a novel energy management strategy based on the twin delayed deep deterministic policy gradient (TD3) algorithm is proposed for battery-ultracapacitor electric vehicles in this study. In addition, a driving condition recognition method is integrated into the energy management strategy framework to reduce the training time of the TD3 agent. The detailed implementation steps are as follows. First, dynamic experiments are performed to establish high-precision models of the battery and ultracapacitor. Second, learning vector quantization neural networks are applied to classify driving conditions into urban, suburban and highway conditions. Third, three parallel TD3 agents are trained for urban, suburban and highway conditions, respectively. Finally, the proposed strategy is evaluated under standard driving cycles. The simulation results indicate that, compared with the TD3-based strategy, the proposed strategy improves the economy by 1 % and reduces the training time by 34 %, and the economic gap with the dynamic programming-based energy management strategy is narrowed to 3 %.
KW - Deep reinforcement learning
KW - Driving cycle construction
KW - Energy management strategy
KW - Hybrid energy storage system
KW - Parameter identification
UR - http://www.scopus.com/inward/record.url?scp=85163304531&partnerID=8YFLogxK
U2 - 10.1016/j.est.2023.108147
DO - 10.1016/j.est.2023.108147
M3 - Article
AN - SCOPUS:85163304531
SN - 2352-152X
VL - 71
JO - Journal of Energy Storage
JF - Journal of Energy Storage
M1 - 108147
ER -