TY - GEN
T1 - Multi-Objective Energy Management Strategy for Fuel Cell Trucks Using DP-Pretrained SAC
AU - Zhou, Zhiqiang
AU - He, Hongwen
AU - Wu, Jingda
AU - Li, Kunang
AU - Huang, Ruchen
AU - Li, Jiaqi
N1 - Publisher Copyright: © 2025 IEEE.
PY - 2025
Y1 - 2025
N2 - In the development of fuel cell hybrid electric trucks, the energy management strategy (EMS) plays a key role in enhancing system efficiency and operational economy. The objectives of the EMS include not only minimizing hydrogen consumption but also extending the powertrain system's lifespan to support long-term sustainable commercial operation. This paper proposes a multi-objective energy management framework that integrates Dynamic Programming (DP) and the Soft Actor-Critic (SAC) algorithm. Specifically, DP is utilized offline to generate optimal energy allocation trajectories, which are then used to initialize the SAC replay buffer and thereby warm start the training process. Furthermore, a multi-objective reward function is designed to jointly optimize hydrogen consumption and the degradation of both the fuel cell and lithium-ion battery systems. Simulation results show that the proposed strategy incurs only 2.86% higher total cost compared to the DP baseline, while achieving 5.11% and 23.74% lower costs than the standard SAC and rule-based strategies, respectively. In addition, the training time is reduced by 50.64% compared to the standard SAC method, confirming the effectiveness of the pretrained SAC approach in accelerating convergence and delivering high-performance energy management.
AB - In the development of fuel cell hybrid electric trucks, the energy management strategy (EMS) plays a key role in enhancing system efficiency and operational economy. The objectives of the EMS include not only minimizing hydrogen consumption but also extending the powertrain system's lifespan to support long-term sustainable commercial operation. This paper proposes a multi-objective energy management framework that integrates Dynamic Programming (DP) and the Soft Actor-Critic (SAC) algorithm. Specifically, DP is utilized offline to generate optimal energy allocation trajectories, which are then used to initialize the SAC replay buffer and thereby warm start the training process. Furthermore, a multi-objective reward function is designed to jointly optimize hydrogen consumption and the degradation of both the fuel cell and lithium-ion battery systems. Simulation results show that the proposed strategy incurs only 2.86% higher total cost compared to the DP baseline, while achieving 5.11% and 23.74% lower costs than the standard SAC and rule-based strategies, respectively. In addition, the training time is reduced by 50.64% compared to the standard SAC method, confirming the effectiveness of the pretrained SAC approach in accelerating convergence and delivering high-performance energy management.
KW - DP-pretrained
KW - Energy management strategy
KW - Fuel cell trucks
KW - Multi-objective optimization
KW - Soft actor-critic algorithm
UR - https://www.scopus.com/pages/publications/105038005594
U2 - 10.1109/ICEPG67373.2025.11466710
DO - 10.1109/ICEPG67373.2025.11466710
M3 - Conference contribution
AN - SCOPUS:105038005594
T3 - 2025 IEEE 7th International Conference on Energy, Power and Grid, ICEPG 2025
SP - 848
EP - 853
BT - 2025 IEEE 7th International Conference on Energy, Power and Grid, ICEPG 2025
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2025 IEEE 7th International Conference on Energy, Power and Grid, ICEPG 2025
Y2 - 12 September 2025 through 14 September 2025
ER -