Deep reinforcement learning-based multi-objective control of hybrid power system combined with road recognition under time-varying environment

  • Jiaxin Chen
  • Hong Shu*
  • Xiaolin Tang*
  • Teng Liu
  • Weida Wang

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

58 Citations (Scopus)

Abstract

Aiming to promote the intelligent development of control technology for new energy vehicles and to demonstrate the outstanding advantages of deep reinforcement learning (DRL), this paper first trained a VGG16-based road recognition convolutional neural network (CNN). A large number of high-definition images of five typical road types were collected from the racing game Dust Rally 2.0: dry asphalt, wet asphalt, snow, dry cobblestone, and wet cobblestone. Then, a time-varying driving environment model was established, involving driving images, road slope, longitudinal speed, and the number of passengers. Finally, a stereoscopic control network suited to a nine-dimensional state space and a three-dimensional action space was built, and for parallel hybrid electric vehicles (HEVs) with the P3 structure, a deep Q-network (DQN)-based energy management strategy (EMS) achieving multi-objective control was proposed, comprising a fine-tuning strategy for motor speed to maintain the optimal slip rate during braking, an engine power control strategy, and a continuously variable transmission (CVT) gear-ratio control strategy. Simulation results show that, under the influence of factors such as tree shade and image compression, the road recognition network achieves its highest accuracy on snow and wet asphalt roads. The three control strategies learned simultaneously by the stereoscopic control network not only maintain a near-optimal slip rate during braking but also achieve a fuel consumption of 4788.93 g, compared with 4295.61 g for the dynamic programming (DP)-based EMS. Moreover, even though the DP-based EMS contains only three states and two actions, the time required for the DP-based EMS and the DQN-based EMS to run the 3602 s speed cycle is about 4911 s and 10 s, respectively. Therefore, both the optimization quality and the real-time performance of the DRL-based EMS can be guaranteed.
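To make the abstract's control setup concrete, the sketch below shows how a nine-dimensional state and a three-dimensional action space (motor-speed fine-tuning, engine power, CVT gear ratio) might be wired into a DQN-style learner. This is a minimal illustration, not the paper's implementation: the state layout, the discretization levels of each action dimension, the linear Q-function, and the placeholder reward are all assumptions made here for demonstration.

```python
import numpy as np
from itertools import product

# Hypothetical 9-D state: e.g. road-type one-hot (5), slope, speed,
# passenger count, battery SOC -- the exact layout is an assumption.
STATE_DIM = 9

# 3-D action, each dimension discretized into a few illustrative levels
# (the paper's actual discretization is not given in the abstract).
MOTOR_LEVELS = [-1.0, 0.0, 1.0]   # motor-speed fine-tuning for slip control
POWER_LEVELS = [0.0, 0.5, 1.0]    # normalized engine power command
RATIO_LEVELS = [0.5, 1.0, 2.0]    # CVT gear ratio
ACTIONS = list(product(MOTOR_LEVELS, POWER_LEVELS, RATIO_LEVELS))  # 27 joint actions

rng = np.random.default_rng(0)
# Linear Q-approximation in place of the paper's deep network: Q(s, a) = W[a] @ s
W = rng.normal(scale=0.01, size=(len(ACTIONS), STATE_DIM))

def q_values(state):
    return W @ state

def select_action(state, eps=0.1):
    """Epsilon-greedy choice over the flattened joint action space."""
    if rng.random() < eps:
        return int(rng.integers(len(ACTIONS)))
    return int(np.argmax(q_values(state)))

def td_update(s, a, r, s_next, alpha=0.01, gamma=0.99):
    """One DQN-style temporal-difference step on the linear Q weights."""
    target = r + gamma * np.max(q_values(s_next))
    td_error = target - q_values(s)[a]
    W[a] += alpha * td_error * s

# Toy rollout with random transitions, just to exercise the update loop.
s = rng.normal(size=STATE_DIM)
for _ in range(100):
    a = select_action(s)
    r = -abs(ACTIONS[a][1])        # placeholder reward, e.g. penalizing fuel use
    s_next = rng.normal(size=STATE_DIM)
    td_update(s, a, r, s_next)
    s = s_next
```

The key design point mirrored from the abstract is that the three control objectives share one network: here the three action dimensions are flattened into a single 27-way discrete choice so a single Q-function selects all three commands jointly at each step.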

Original language: English
Article number: 122123
Journal: Energy
Volume: 239
DOIs
Publication status: Published - 15 Jan 2022

Keywords

  • Deep reinforcement learning
  • Energy management strategy
  • Hybrid electric vehicle
  • Multi-objective control network
  • Road recognition network
