Large-scale 3D Semantic Mapping Using Stereo Vision

Yi Yang; Fan Qiu; Hao Li; Lu Zhang; Mei Ling Wang; Meng Yin Fu

doi:10.1007/s11633-018-1118-y

Corresponding author: Yi Yang

School of Automation

Research output: Contribution to journal › Article › peer-review

18 Citations (Scopus)

Abstract

In recent years, there has been considerable interest in incorporating semantics into simultaneous localization and mapping (SLAM) systems. This paper presents an approach to generating an outdoor large-scale 3D dense semantic map based on binocular stereo vision. The inputs to the system are stereo color images from a moving vehicle. First, the dense 3D space around the vehicle is constructed, and the motion of the camera is estimated by visual odometry. Meanwhile, semantic segmentation is performed online using deep learning, and the semantic labels are also used to verify the feature matching in visual odometry. These three processes calculate the motion, depth and semantic label of every pixel in the input views. Then, a voxel conditional random field (CRF) inference is introduced to fuse the semantic labels into voxels. After that, we present a method to remove moving objects by incorporating the semantic labels, which improves the motion segmentation accuracy. The last step is to generate the dense 3D semantic map of an urban environment from an arbitrarily long image sequence. We evaluate our approach on the KITTI vision benchmark, and the results show that the proposed method is effective.
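As a rough illustration of how per-pixel depth and semantic labels can end up in a voxel map, the sketch below back-projects a pixel using the standard rectified-stereo relation (depth = focal length × baseline / disparity) and then assigns each voxel the label it was observed with most often. This is not the paper's method: the abstract describes a voxel CRF inference, and a plain majority vote is swapped in here for brevity. All function names, parameters, and camera values are hypothetical.

```python
from collections import Counter, defaultdict
from math import floor


def backproject(u, v, disparity, fx, fy, cx, cy, baseline):
    """Return the camera-frame point (x, y, z) for pixel (u, v).

    Assumes a rectified stereo pair and a pinhole camera model;
    the intrinsics (fx, fy, cx, cy) and baseline are illustrative inputs.
    """
    if disparity <= 0:
        raise ValueError("disparity must be positive")
    z = fx * baseline / disparity  # standard rectified-stereo depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return (x, y, z)


def voxel_index(point, voxel_size):
    """Map a 3D point to the integer index of the voxel containing it."""
    return tuple(floor(c / voxel_size) for c in point)


def fuse_labels(observations, voxel_size):
    """Fuse (point, label) observations into one label per voxel.

    Majority vote only: a simplified stand-in for the voxel CRF
    mentioned in the abstract, with no pairwise smoothing between voxels.
    """
    votes = defaultdict(Counter)
    for point, label in observations:
        votes[voxel_index(point, voxel_size)][label] += 1
    return {voxel: counts.most_common(1)[0][0] for voxel, counts in votes.items()}
```

A vote count is the simplest way to accumulate evidence over a long image sequence; a CRF would additionally encourage neighbouring voxels to agree on their labels.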

Original language	English
Pages (from-to)	194-206
Number of pages	13
Journal	International Journal of Automation and Computing
Volume	15
Issue number	2
DOI	https://doi.org/10.1007/s11633-018-1118-y
Publication status	Published - 1 Apr 2018

Cite this

Yang, Y., Qiu, F., Li, H., Zhang, L., Wang, M. L., & Fu, M. Y. (2018). Large-scale 3D Semantic Mapping Using Stereo Vision. International Journal of Automation and Computing, 15(2), 194-206. https://doi.org/10.1007/s11633-018-1118-y

@article{4a988646660e4b83986a7ded6a36cfc1,

title = "Large-scale 3D Semantic Mapping Using Stereo Vision",

abstract = "In recent years, there have been a lot of interests in incorporating semantics into simultaneous localization and mapping (SLAM) systems. This paper presents an approach to generate an outdoor large-scale 3D dense semantic map based on binocular stereo vision. The inputs to system are stereo color images from a moving vehicle. First, dense 3D space around the vehicle is constructed, and the motion of camera is estimated by visual odometry. Meanwhile, semantic segmentation is performed through the deep learning technology online, and the semantic labels are also used to verify the feature matching in visual odometry. These three processes calculate the motion, depth and semantic label of every pixel in the input views. Then, a voxel conditional random field (CRF) inference is introduced to fuse semantic labels to voxel. After that, we present a method to remove the moving objects by incorporating the semantic labels, which improves the motion segmentation accuracy. The last is to generate the dense 3D semantic map of an urban environment from arbitrary long image sequence. We evaluate our approach on KITTI vision benchmark, and the results show that the proposed method is effective.",

keywords = "Semantic map, motion segmentation, simultaneous localization and mapping (SLAM), stereo vision, visual odometry",

author = "Yi Yang and Fan Qiu and Hao Li and Lu Zhang and Wang, {Mei Ling} and Fu, {Meng Yin}",

note = "Publisher Copyright: {\textcopyright} 2018, Institute of Automation, Chinese Academy of Sciences and Springer-Verlag GmbH Germany, part of Springer Nature.",

year = "2018",

month = apr,

day = "1",

doi = "10.1007/s11633-018-1118-y",

language = "English",

volume = "15",

pages = "194--206",

journal = "International Journal of Automation and Computing",

issn = "1476-8186",

publisher = "Chinese Academy of Sciences",

number = "2",

}

TY - JOUR

T1 - Large-scale 3D Semantic Mapping Using Stereo Vision

AU - Yang, Yi

AU - Qiu, Fan

AU - Li, Hao

AU - Zhang, Lu

AU - Wang, Mei Ling

AU - Fu, Meng Yin

PY - 2018/4/1

Y1 - 2018/4/1

N2 - In recent years, there have been a lot of interests in incorporating semantics into simultaneous localization and mapping (SLAM) systems. This paper presents an approach to generate an outdoor large-scale 3D dense semantic map based on binocular stereo vision. The inputs to system are stereo color images from a moving vehicle. First, dense 3D space around the vehicle is constructed, and the motion of camera is estimated by visual odometry. Meanwhile, semantic segmentation is performed through the deep learning technology online, and the semantic labels are also used to verify the feature matching in visual odometry. These three processes calculate the motion, depth and semantic label of every pixel in the input views. Then, a voxel conditional random field (CRF) inference is introduced to fuse semantic labels to voxel. After that, we present a method to remove the moving objects by incorporating the semantic labels, which improves the motion segmentation accuracy. The last is to generate the dense 3D semantic map of an urban environment from arbitrary long image sequence. We evaluate our approach on KITTI vision benchmark, and the results show that the proposed method is effective.

AB - In recent years, there have been a lot of interests in incorporating semantics into simultaneous localization and mapping (SLAM) systems. This paper presents an approach to generate an outdoor large-scale 3D dense semantic map based on binocular stereo vision. The inputs to system are stereo color images from a moving vehicle. First, dense 3D space around the vehicle is constructed, and the motion of camera is estimated by visual odometry. Meanwhile, semantic segmentation is performed through the deep learning technology online, and the semantic labels are also used to verify the feature matching in visual odometry. These three processes calculate the motion, depth and semantic label of every pixel in the input views. Then, a voxel conditional random field (CRF) inference is introduced to fuse semantic labels to voxel. After that, we present a method to remove the moving objects by incorporating the semantic labels, which improves the motion segmentation accuracy. The last is to generate the dense 3D semantic map of an urban environment from arbitrary long image sequence. We evaluate our approach on KITTI vision benchmark, and the results show that the proposed method is effective.

KW - Semantic map

KW - motion segmentation

KW - simultaneous localization and mapping (SLAM)

KW - stereo vision

KW - visual odometry

UR - http://www.scopus.com/inward/record.url?scp=85043374623&partnerID=8YFLogxK

U2 - 10.1007/s11633-018-1118-y

DO - 10.1007/s11633-018-1118-y

M3 - Article

AN - SCOPUS:85043374623

SN - 1476-8186

VL - 15

SP - 194

EP - 206

JO - International Journal of Automation and Computing

JF - International Journal of Automation and Computing

IS - 2

ER -
