TY - JOUR
T1 - Reinforcement learning energy management for hybrid electric tracked vehicles based on the normalized advantage function
AU - Zou, Yuan
AU - Zhang, Bin
AU - Zhang, Xudong
AU - Zhao, Zhiying
AU - Kang, Tieyu
AU - Guo, Yufeng
AU - Wu, Zhe
N1 - Publisher Copyright:
© 2021, Editorial Board of Acta Armamentarii. All rights reserved.
PY - 2021/10
Y1 - 2021/10
N2 - Energy management strategies based on reinforcement learning suffer from the "curse of dimensionality" in high-dimensional problems because the state and control variables must be discretized. To address this problem, a new energy management algorithm based on deep reinforcement learning with a normalized advantage function is proposed, in which two deep neural networks with normalized advantage functions realize continuous control of energy and eliminate the discretization of the state and control variables. Based on a powertrain model of a series hybrid tracked vehicle, the framework of the proposed deep reinforcement learning algorithm is built and its parameter update process is completed for the vehicle. Simulation results show that the proposed algorithm outputs finer control quantities with less output fluctuation. Compared with the deep Q-learning algorithm, the proposed algorithm improves the fuel economy of the series hybrid tracked vehicle by 3.96%. In addition, the adaptability of the proposed algorithm and its optimization performance in a real-time control environment are verified by hardware-in-the-loop simulation.
AB - Energy management strategies based on reinforcement learning suffer from the "curse of dimensionality" in high-dimensional problems because the state and control variables must be discretized. To address this problem, a new energy management algorithm based on deep reinforcement learning with a normalized advantage function is proposed, in which two deep neural networks with normalized advantage functions realize continuous control of energy and eliminate the discretization of the state and control variables. Based on a powertrain model of a series hybrid tracked vehicle, the framework of the proposed deep reinforcement learning algorithm is built and its parameter update process is completed for the vehicle. Simulation results show that the proposed algorithm outputs finer control quantities with less output fluctuation. Compared with the deep Q-learning algorithm, the proposed algorithm improves the fuel economy of the series hybrid tracked vehicle by 3.96%. In addition, the adaptability of the proposed algorithm and its optimization performance in a real-time control environment are verified by hardware-in-the-loop simulation.
KW - Continuous control
KW - Energy management strategy
KW - Hardware-in-the-loop simulation
KW - Normalized advantage function
KW - Series hybrid tracked vehicle
UR - http://www.scopus.com/inward/record.url?scp=85119067992&partnerID=8YFLogxK
U2 - 10.3969/j.issn.1000-1093.2021.10.011
DO - 10.3969/j.issn.1000-1093.2021.10.011
M3 - Article
AN - SCOPUS:85119067992
SN - 1000-1093
VL - 42
SP - 2159
EP - 2169
JO - Binggong Xuebao/Acta Armamentarii
JF - Binggong Xuebao/Acta Armamentarii
IS - 10
ER -