Abstract
Deep learning techniques have been widely used in autonomous driving community for the purpose of environment perception. Recently, it starts being adopted for learning end-to-end controllers for complex driving scenarios. However, the complexity and nonlinearity of the network architecture limits its interpretability to understand driving scenarios and judge the importance of certain visual regions in sensory scenes. In this paper, based on the convolutional neural network (CNN), we propose two complementary frameworks to automatically determine the most contributive regions of the input scenes, offering intuitive knowledge of how a trained end-to-end autonomous vehicle controller understands driving scenarios. In the first framework, a feature map-based method is proposed by leveraging current progress in CNN visualization, in which the deconvolution approach recovers the feature maps to extract features that contribute most to understand driving scenes. In the second framework, the importance level of regions is ranked using the error map between the labeled and predicted control inputs generated by occluding different parts of input scenes, thus providing a pixel-wise rank of importance. Test data sets with extracted contributive regions are input to the CNN controller. Then, different CNN controllers trained with the new data sets preprocessed using our proposed frameworks are verified via closed-loop tests. Results show that both the features identified from the first framework and the regions identified from the second framework are of crucial importance to scene understanding for the controller and can significantly affect the performance of CNN controllers.
Original language | English |
---|---|
Article number | 8480450 |
Pages (from-to) | 53-63 |
Number of pages | 11 |
Journal | IEEE Transactions on Systems, Man, and Cybernetics: Systems |
Volume | 49 |
Issue number | 1 |
DOIs | |
Publication status | Published - Jan 2019 |
Externally published | Yes |
Keywords
- Autonomous vehicles
- convolutional neural network (CNN)
- scene understanding