Abstract
Given the shortcomings of high-precision stereo matching networks based on binocular vision, such as high computing-resource consumption, long running time, and unsuitability for real-time navigation in intelligent driving systems, this study proposes a dynamic fusion stereo matching deep learning network that meets in-vehicle real-time and accuracy requirements. The network uses a global deep-convolution-based attention module to perform feature extraction while reducing the number of network layers and parameters, and it accelerates the commonly used 3D feature fusion process by optimizing 3D convolution through dynamic cost cascade fusion, multi-scale fusion, and dynamic disparity change. The trained model is tested on the KITTI Stereo 2015 dataset using onboard hardware such as the NVIDIA Jetson TX2. Experiments show that the method achieves accuracy comparable to state-of-the-art methods currently on the leaderboard, with a 3-pixel error below 6.58% and a runtime under 0.1 seconds per frame, meeting real-time performance requirements.
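To make the pipeline described in the abstract concrete, the sketch below shows the general shape of an attention-weighted feature extractor, a concatenation cost volume, lightweight 3D aggregation, and soft-argmin disparity regression. It is a minimal illustration under assumed module names and sizes (`AttentionFeatureExtractor`, `StereoNetSketch`, `max_disp=48`, etc.), not the paper's implementation; in particular, the dynamic cost cascade fusion and dynamic disparity change described by the authors are replaced here by a plain 3D convolution stack.

```python
# Minimal sketch of a cost-volume stereo matching pipeline of the general kind
# described in the abstract. All names and sizes are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class AttentionFeatureExtractor(nn.Module):
    """2D feature extractor with a simple global channel-attention gate."""

    def __init__(self, in_ch=3, feat_ch=32):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(in_ch, feat_ch, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(feat_ch, feat_ch, 3, stride=2, padding=1), nn.ReLU(inplace=True),
        )
        # Global pooling plus a 1x1 conv produces per-channel attention weights.
        self.attn = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(feat_ch, feat_ch, 1), nn.Sigmoid(),
        )

    def forward(self, x):
        f = self.backbone(x)
        return f * self.attn(f)  # re-weight channels by global context


def build_cost_volume(left_feat, right_feat, max_disp):
    """Concatenation cost volume at 1/4 resolution: (B, 2C, D, H, W)."""
    b, c, h, w = left_feat.shape
    vol = left_feat.new_zeros(b, 2 * c, max_disp, h, w)
    for d in range(max_disp):
        vol[:, :c, d, :, d:] = left_feat[:, :, :, d:]
        vol[:, c:, d, :, d:] = right_feat[:, :, :, : w - d]
    return vol


class StereoNetSketch(nn.Module):
    def __init__(self, max_disp=48, feat_ch=32):
        super().__init__()
        self.max_disp = max_disp
        self.extract = AttentionFeatureExtractor(feat_ch=feat_ch)
        # Lightweight 3D aggregation; the paper's dynamic/cascaded fusion of the
        # cost volume is omitted here for brevity.
        self.aggregate = nn.Sequential(
            nn.Conv3d(2 * feat_ch, 32, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv3d(32, 1, 3, padding=1),
        )

    def forward(self, left, right):
        lf, rf = self.extract(left), self.extract(right)
        cost = self.aggregate(build_cost_volume(lf, rf, self.max_disp)).squeeze(1)
        prob = F.softmax(-cost, dim=1)                      # (B, D, H, W)
        disp = torch.arange(self.max_disp, device=prob.device, dtype=prob.dtype)
        disp = (prob * disp.view(1, -1, 1, 1)).sum(dim=1)   # soft-argmin regression
        # Upsample to input resolution and rescale disparity by the same factor.
        return 4 * F.interpolate(disp.unsqueeze(1), scale_factor=4,
                                 mode="bilinear", align_corners=False).squeeze(1)


if __name__ == "__main__":
    net = StereoNetSketch()
    left = torch.randn(1, 3, 128, 256)
    right = torch.randn(1, 3, 128, 256)
    print(net(left, right).shape)  # torch.Size([1, 128, 256])
```

The soft-argmin step turns the aggregated cost into a differentiable expectation over candidate disparities, which is the standard way such networks regress sub-pixel disparity and is compatible with the end-to-end training implied by the abstract.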
| Original language | English |
|---|---|
| Pages (from-to) | 1145-1153 |
| Number of pages | 9 |
| Journal | CAAI Transactions on Intelligent Systems |
| Volume | 17 |
| Issue number | 6 |
| DOIs | |
| Publication status | Published - Nov 2022 |
Keywords
- binocular vision
- deep learning
- disparity estimation
- dynamic computation
- feature fusion
- on-board vision
- stereo matching