Stabilization Approaches for Reinforcement Learning-Based End-to-End Autonomous Driving

Siyuan Chen; Meiling Wang; Wenjie Song; Yi Yang; Yujun Li; Mengyin Fu

doi:10.1109/TVT.2020.2979493

Siyuan Chen, Meiling Wang, Wenjie Song*, Yi Yang, Yujun Li, Mengyin Fu

*Corresponding author for this work

School of Automation

Research output: Contribution to journal › Article › peer-review

60 Citations (Scopus)

Abstract

Deep reinforcement learning (DRL) has been successfully applied to end-to-end autonomous driving, especially in simulation environments. However, common DRL approaches are sometimes unstable or difficult to converge in complex autonomous driving scenarios. This paper proposes two approaches to improve the stability of policy model training while using as little manual data as possible. In the first approach, reinforcement learning is combined with imitation learning: a feature network is trained on a small amount of manual data and used for parameter initialization. In the second approach, an auxiliary network is added to the reinforcement learning framework; it leverages real-time measurement information to deepen the understanding of the environment, without any guidance from demonstrators. To verify the effectiveness of the two approaches, simulations are conducted in image-based and lidar-based end-to-end autonomous driving systems, respectively. The approaches are tested not only in a virtual game world but also in Gazebo, where we build a 3D world based on the real vehicle model of the Ranger XP900 platform, a real 3D obstacle model, and real motion constraints with inertial characteristics, so that the trained end-to-end autonomous driving model is better suited to the real world. Experimental results show that performance increases by over 45% in the virtual game world, and that training converges quickly and stably in Gazebo, where previous methods can hardly converge.

Original language	English
Article number	9028159
Pages (from-to)	4740-4750
Number of pages	11
Journal	IEEE Transactions on Vehicular Technology
Volume	69
Issue number	5
DOIs	https://doi.org/10.1109/TVT.2020.2979493
Publication status	Published - May 2020

Keywords

Deep reinforcement learning
autonomous driving
end-to-end
stabilization

Cite this

Chen, S., Wang, M., Song, W., Yang, Y., Li, Y., & Fu, M. (2020). Stabilization Approaches for Reinforcement Learning-Based End-to-End Autonomous Driving. IEEE Transactions on Vehicular Technology, 69(5), 4740-4750. Article 9028159. https://doi.org/10.1109/TVT.2020.2979493

@article{daf8eee2e786480c8d6f644d59a46307,
title = "Stabilization Approaches for Reinforcement Learning-Based End-to-End Autonomous Driving",
keywords = "Deep reinforcement learning, autonomous driving, end-to-end, stabilization",
author = "Siyuan Chen and Meiling Wang and Wenjie Song and Yi Yang and Yujun Li and Mengyin Fu",
note = "Publisher Copyright: {\textcopyright} 1967-2012 IEEE.",
year = "2020",
month = may,
doi = "10.1109/TVT.2020.2979493",
language = "English",
volume = "69",
pages = "4740--4750",
journal = "IEEE Transactions on Vehicular Technology",
issn = "0018-9545",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
number = "5",
}

TY - JOUR
T1 - Stabilization Approaches for Reinforcement Learning-Based End-to-End Autonomous Driving
AU - Chen, Siyuan
AU - Wang, Meiling
AU - Song, Wenjie
AU - Yang, Yi
AU - Li, Yujun
AU - Fu, Mengyin
PY - 2020/5
Y1 - 2020/5
KW - Deep reinforcement learning
KW - autonomous driving
KW - end-to-end
KW - stabilization
UR - http://www.scopus.com/inward/record.url?scp=85085096312&partnerID=8YFLogxK
U2 - 10.1109/TVT.2020.2979493
DO - 10.1109/TVT.2020.2979493
M3 - Article
AN - SCOPUS:85085096312
SN - 0018-9545
VL - 69
SP - 4740
EP - 4750
JO - IEEE Transactions on Vehicular Technology
JF - IEEE Transactions on Vehicular Technology
IS - 5
M1 - 9028159
ER -
