TY - JOUR
T1 - Scene understanding in deep learning-based end-to-end controllers for autonomous vehicles
AU - Yang, Shun
AU - Wang, Wenshuo
AU - Liu, Chang
AU - Deng, Weiwen
N1 - Publisher Copyright:
© 2019 IEEE.
PY - 2019/1
Y1 - 2019/1
N2 - Deep learning techniques have been widely used in the autonomous driving community for environment perception and have recently been adopted for learning end-to-end controllers for complex driving scenarios. However, the complexity and nonlinearity of the network architecture limit its interpretability, making it difficult to understand how the controller perceives driving scenarios and to judge the importance of particular visual regions in sensory scenes. In this paper, based on the convolutional neural network (CNN), we propose two complementary frameworks that automatically determine the most contributive regions of the input scenes, offering intuitive insight into how a trained end-to-end autonomous vehicle controller understands driving scenarios. The first framework is a feature map-based method that leverages recent progress in CNN visualization: a deconvolution approach recovers the feature maps to extract the features that contribute most to understanding driving scenes. The second framework ranks the importance of regions using the error map between the labeled and predicted control inputs obtained by occluding different parts of the input scenes, thus providing a pixel-wise ranking of importance. Test data sets containing the extracted contributive regions are fed to the CNN controller, and CNN controllers trained with data sets preprocessed by the proposed frameworks are verified via closed-loop tests. Results show that both the features identified by the first framework and the regions identified by the second framework are crucial to the controller's scene understanding and can significantly affect the performance of CNN controllers.
AB - Deep learning techniques have been widely used in the autonomous driving community for environment perception and have recently been adopted for learning end-to-end controllers for complex driving scenarios. However, the complexity and nonlinearity of the network architecture limit its interpretability, making it difficult to understand how the controller perceives driving scenarios and to judge the importance of particular visual regions in sensory scenes. In this paper, based on the convolutional neural network (CNN), we propose two complementary frameworks that automatically determine the most contributive regions of the input scenes, offering intuitive insight into how a trained end-to-end autonomous vehicle controller understands driving scenarios. The first framework is a feature map-based method that leverages recent progress in CNN visualization: a deconvolution approach recovers the feature maps to extract the features that contribute most to understanding driving scenes. The second framework ranks the importance of regions using the error map between the labeled and predicted control inputs obtained by occluding different parts of the input scenes, thus providing a pixel-wise ranking of importance. Test data sets containing the extracted contributive regions are fed to the CNN controller, and CNN controllers trained with data sets preprocessed by the proposed frameworks are verified via closed-loop tests. Results show that both the features identified by the first framework and the regions identified by the second framework are crucial to the controller's scene understanding and can significantly affect the performance of CNN controllers.
KW - Autonomous vehicles
KW - convolutional neural network (CNN)
KW - scene understanding
UR - http://www.scopus.com/inward/record.url?scp=85054552728&partnerID=8YFLogxK
U2 - 10.1109/TSMC.2018.2868372
DO - 10.1109/TSMC.2018.2868372
M3 - Article
AN - SCOPUS:85054552728
SN - 2168-2216
VL - 49
SP - 53
EP - 63
JO - IEEE Transactions on Systems, Man, and Cybernetics: Systems
JF - IEEE Transactions on Systems, Man, and Cybernetics: Systems
IS - 1
M1 - 8480450
ER -