Real-time stereo matching network for vehicle binocular vision based on dynamic cascade correction

Guohao He; Yong Zhai; Jianwei Gong; Yuchun Wang; Xi Zhang

doi:10.11992/tis.202111013

Guohao He, Yong Zhai, Jianwei Gong^*, Yuchun Wang, Xi Zhang

^*Corresponding author for this work

Beijing Institute of Technology

Research output: Contribution to journal › Article › peer-review

Abstract

High-precision stereo matching networks based on binocular vision suffer from high computing resource consumption and long operating times, which prevents their use in real-time navigation by intelligent driving systems. To address these shortcomings, this study proposes a dynamic fusion stereo matching deep learning network that can meet real-time and accuracy requirements in vehicles. The network uses a global deep convolution-based attention module to complete feature extraction while reducing the number of network layers and parameters, and it optimizes 3D convolution calculations through dynamic cost cascade fusion, multi-scale fusion, and dynamic disparity change to accelerate the commonly used 3D feature fusion process. The trained model is tested on the KITTI Stereo 2015 dataset using onboard hardware such as the NVIDIA Jetson TX2. Experiments show that the method achieves the same operating accuracy as the state-of-the-art method currently on the leaderboard, with a 3-pixel error of less than 6.58% and an operating duration of less than 0.1 seconds per frame, meeting real-time performance requirements.
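The abstract reports accuracy as a 3-pixel error. As a rough illustration only, the sketch below computes the fraction of annotated pixels whose predicted disparity differs from the ground truth by more than 3 pixels. The function name `three_pixel_error`, the flat-list input, and the use of `None` for pixels without ground truth are assumptions made for this example, not taken from the paper. The official KITTI 2015 evaluation also applies a relative tolerance (5% of the true disparity), which this simplified version omits.

```python
from typing import Optional, Sequence


def three_pixel_error(
    predicted: Sequence[float],
    ground_truth: Sequence[Optional[float]],
    threshold_px: float = 3.0,
) -> float:
    """Fraction of annotated pixels whose disparity error exceeds threshold_px.

    Hypothetical helper: pixels whose ground truth is None are treated as
    unannotated and skipped (an assumption of this sketch).
    """
    if len(predicted) != len(ground_truth):
        raise ValueError("predicted and ground_truth must have the same length")

    valid = 0  # pixels that have a ground-truth disparity
    bad = 0    # valid pixels whose absolute error is above the threshold
    for pred, truth in zip(predicted, ground_truth):
        if truth is None:
            continue
        valid += 1
        if abs(pred - truth) > threshold_px:
            bad += 1

    if valid == 0:
        raise ValueError("no annotated ground-truth pixels")
    return bad / valid


if __name__ == "__main__":
    # Toy disparities in pixels; the third pixel has no annotation.
    pred = [10.0, 22.5, 31.0, 7.0]
    truth = [10.5, 18.0, None, 7.2]
    print(f"3-pixel error: {three_pixel_error(pred, truth):.2%}")
```

In this toy input only one of the three annotated pixels is off by more than 3 pixels, so the rate is one third. A real evaluation would average this over full disparity maps.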

Original language	English
Pages (from-to)	1145-1153
Number of pages	9
Journal	CAAI Transactions on Intelligent Systems
Volume	17
Issue number	6
DOI	https://doi.org/10.11992/tis.202111013
Publication status	Published - Nov 2022

Cite this

@article{48b338f55f2842708c9b5a57a2660c83,

title = "Real-time stereo matching network for vehicle binocular vision based on dynamic cascade correction",

abstract = "High-precision stereo matching networks based on binocular vision suffer from high computing resource consumption and long operating times, which prevents their use in real-time navigation by intelligent driving systems. To address these shortcomings, this study proposes a dynamic fusion stereo matching deep learning network that can meet real-time and accuracy requirements in vehicles. The network uses a global deep convolution-based attention module to complete feature extraction while reducing the number of network layers and parameters, and it optimizes 3D convolution calculations through dynamic cost cascade fusion, multi-scale fusion, and dynamic disparity change to accelerate the commonly used 3D feature fusion process. The trained model is tested on the KITTI Stereo 2015 dataset using onboard hardware such as the NVIDIA Jetson TX2. Experiments show that the method achieves the same operating accuracy as the state-of-the-art method currently on the leaderboard, with a 3-pixel error of less than 6.58% and an operating duration of less than 0.1 seconds per frame, meeting real-time performance requirements.",

keywords = "binocular vision, deep learning, disparity estimation, dynamic computation, feature fusion, on-board vision, stereo matching",

author = "Guohao He and Yong Zhai and Jianwei Gong and Yuchun Wang and Xi Zhang",

year = "2022",

month = nov,

doi = "10.11992/tis.202111013",

language = "English",

volume = "17",

pages = "1145--1153",

journal = "CAAI Transactions on Intelligent Systems",

issn = "1673-4785",

publisher = "Editorial Department of CAAI Transactions on Intelligent Systems",

number = "6",

}

TY - JOUR

T1 - Real-time stereo matching network for vehicle binocular vision based on dynamic cascade correction

AU - He, Guohao

AU - Zhai, Yong

AU - Gong, Jianwei

AU - Wang, Yuchun

AU - Zhang, Xi

PY - 2022/11

Y1 - 2022/11

N2 - High-precision stereo matching networks based on binocular vision suffer from high computing resource consumption and long operating times, which prevents their use in real-time navigation by intelligent driving systems. To address these shortcomings, this study proposes a dynamic fusion stereo matching deep learning network that can meet real-time and accuracy requirements in vehicles. The network uses a global deep convolution-based attention module to complete feature extraction while reducing the number of network layers and parameters, and it optimizes 3D convolution calculations through dynamic cost cascade fusion, multi-scale fusion, and dynamic disparity change to accelerate the commonly used 3D feature fusion process. The trained model is tested on the KITTI Stereo 2015 dataset using onboard hardware such as the NVIDIA Jetson TX2. Experiments show that the method achieves the same operating accuracy as the state-of-the-art method currently on the leaderboard, with a 3-pixel error of less than 6.58% and an operating duration of less than 0.1 seconds per frame, meeting real-time performance requirements.

AB - High-precision stereo matching networks based on binocular vision suffer from high computing resource consumption and long operating times, which prevents their use in real-time navigation by intelligent driving systems. To address these shortcomings, this study proposes a dynamic fusion stereo matching deep learning network that can meet real-time and accuracy requirements in vehicles. The network uses a global deep convolution-based attention module to complete feature extraction while reducing the number of network layers and parameters, and it optimizes 3D convolution calculations through dynamic cost cascade fusion, multi-scale fusion, and dynamic disparity change to accelerate the commonly used 3D feature fusion process. The trained model is tested on the KITTI Stereo 2015 dataset using onboard hardware such as the NVIDIA Jetson TX2. Experiments show that the method achieves the same operating accuracy as the state-of-the-art method currently on the leaderboard, with a 3-pixel error of less than 6.58% and an operating duration of less than 0.1 seconds per frame, meeting real-time performance requirements.

KW - binocular vision

KW - deep learning

KW - disparity estimation

KW - dynamic computation

KW - feature fusion

KW - on-board vision

KW - stereo matching

UR - http://www.scopus.com/inward/record.url?scp=85168383404&partnerID=8YFLogxK

U2 - 10.11992/tis.202111013

DO - 10.11992/tis.202111013

M3 - Article

AN - SCOPUS:85168383404

SN - 1673-4785

VL - 17

SP - 1145

EP - 1153

JO - CAAI Transactions on Intelligent Systems

JF - CAAI Transactions on Intelligent Systems

IS - 6

ER -
