TY - GEN
T1 - Online imitation learning for self-driving simulation
AU - Zhang, Zhe
AU - Zhao, Sanyuan
N1 - Publisher Copyright: © 2021 IEEE.
PY - 2021/8/17
Y1 - 2021/8/17
N2 - End-to-end autonomous driving policies have made great progress with the development of deep learning. Current methods fall mainly into two categories: imitation learning and reinforcement learning. Imitation learning can quickly learn the correspondence between states and actions, but it is limited by the dataset and prone to overfitting; work in this area therefore focuses on extracting more robust input state features and building more general datasets. Reinforcement learning obtains richer input states through online training but requires longer training time, so work in this area focuses on reducing training time and designing appropriate rewards. In this paper, we propose an end-to-end temporal convolution model based on a segmentation medium, which uses online imitation learning to obtain richer input states and train more robust policy networks. To reduce training time, we use a segmentation medium of our own design in place of raw sensor information as the input to the policy network. Experiments on the CARLA driving benchmarks show that our approach achieves satisfactory results and has excellent generalization ability.
AB - End-to-end autonomous driving policies have made great progress with the development of deep learning. Current methods fall mainly into two categories: imitation learning and reinforcement learning. Imitation learning can quickly learn the correspondence between states and actions, but it is limited by the dataset and prone to overfitting; work in this area therefore focuses on extracting more robust input state features and building more general datasets. Reinforcement learning obtains richer input states through online training but requires longer training time, so work in this area focuses on reducing training time and designing appropriate rewards. In this paper, we propose an end-to-end temporal convolution model based on a segmentation medium, which uses online imitation learning to obtain richer input states and train more robust policy networks. To reduce training time, we use a segmentation medium of our own design in place of raw sensor information as the input to the policy network. Experiments on the CARLA driving benchmarks show that our approach achieves satisfactory results and has excellent generalization ability.
KW - Autonomous driving
KW - Online imitation learning
KW - Segmentation medium
UR - http://www.scopus.com/inward/record.url?scp=85118954584&partnerID=8YFLogxK
U2 - 10.1109/ICCSE51940.2021.9569543
DO - 10.1109/ICCSE51940.2021.9569543
M3 - Conference contribution
AN - SCOPUS:85118954584
T3 - ICCSE 2021 - IEEE 16th International Conference on Computer Science and Education
SP - 810
EP - 815
BT - ICCSE 2021 - IEEE 16th International Conference on Computer Science and Education
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 16th IEEE International Conference on Computer Science and Education, ICCSE 2021
Y2 - 17 August 2021 through 21 August 2021
ER -