Multi-channel convolutional neural network based 3D object detection for indoor robot environmental perception

Li Wang; Ruifeng Li; Hezi Shi; Jingwen Sun; Lijun Zhao; Hock Soon Seah; Chee Kwang Quah; Budianto Tandianus

doi:10.3390/s19040893

Li Wang, Ruifeng Li^*, Hezi Shi, Jingwen Sun, Lijun Zhao, Hock Soon Seah, Chee Kwang Quah, Budianto Tandianus

^*Corresponding author for this work

Research output: Contribution to journal › Article › Peer-reviewed

21 Citations (Scopus)

Abstract

Environmental perception is a vital feature for service robots when working in an indoor environment for a long time. The general 3D reconstruction is a low-level geometric information description that cannot convey semantics. In contrast, higher level perception similar to humans requires more abstract concepts, such as objects and scenes. Moreover, the 2D object detection based on images always fails to provide the actual position and size of an object, which is quite important for a robot’s operation. In this paper, we focus on the 3D object detection to regress the object’s category, 3D size, and spatial position through a convolutional neural network (CNN). We propose a multi-channel CNN for 3D object detection, which fuses three input channels including RGB, depth, and bird’s eye view (BEV) images. We also propose a method to generate 3D proposals based on 2D ones in the RGB image and semantic prior. Training and test are conducted on the modified NYU V2 dataset and SUN RGB-D dataset in order to verify the effectiveness of the algorithm. We also carry out the actual experiments in a service robot to utilize the proposed 3D object detection method to enhance the environmental perception of the robot.
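The abstract names a bird's eye view (BEV) image as one of the three input channels but does not say how it is built. The following is a minimal sketch of one common way to derive a BEV height map from a depth image, assuming a pinhole camera model. The function name `depth_to_bev`, the intrinsics `fx`, `fy`, `cx`, `cy`, the grid ranges, and the "maximum height per cell" encoding are all illustrative assumptions, not the authors' published procedure.

```python
def depth_to_bev(depth, fx, fy, cx, cy,
                 x_range=(-2.0, 2.0), z_range=(0.0, 4.0), cell=0.5):
    """Project a depth map (metres, list of rows) onto a top-down height grid.

    Hypothetical helper: each BEV cell stores the maximum height seen in it,
    or None when no valid depth pixel falls inside the cell.
    """
    cols = int(round((x_range[1] - x_range[0]) / cell))  # lateral bins (X)
    rows = int(round((z_range[1] - z_range[0]) / cell))  # forward bins (Z)
    bev = [[None] * cols for _ in range(rows)]

    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0:  # treat non-positive depth as an invalid reading
                continue
            # Pinhole back-projection into camera coordinates.
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            height = -y  # image Y points down, so negate to get height
            c = int((x - x_range[0]) // cell)
            r = int((z - z_range[0]) // cell)
            if 0 <= c < cols and 0 <= r < rows:
                if bev[r][c] is None or height > bev[r][c]:
                    bev[r][c] = height
    return bev


# Toy 2x2 depth image with one invalid pixel (0.0).
bev = depth_to_bev([[1.0, 1.0], [0.0, 2.0]], fx=1.0, fy=1.0, cx=0.5, cy=0.5)
```

In a real pipeline the grid would be much finer and the result would be stored as an image tensor alongside the RGB and depth channels. The coarse 0.5 m cell here only keeps the toy example easy to follow.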

Original language	English
Article number	893
Journal	Sensors
Volume	19
Issue number	4
DOI	https://doi.org/10.3390/s19040893
Publication status	Published - 2 Feb 2019
Externally published	Yes

Access to Document

10.3390/s19040893

Other files and links

Link to publication in Scopus

Cite this

@article{d0de163163f044e6a8a5ab22d9f01aa0,

title = "Multi-channel convolutional neural network based 3D object detection for indoor robot environmental perception",

abstract = "Environmental perception is a vital feature for service robots when working in an indoor environment for a long time. The general 3D reconstruction is a low-level geometric information description that cannot convey semantics. In contrast, higher level perception similar to humans requires more abstract concepts, such as objects and scenes. Moreover, the 2D object detection based on images always fails to provide the actual position and size of an object, which is quite important for a robot{\textquoteright}s operation. In this paper, we focus on the 3D object detection to regress the object{\textquoteright}s category, 3D size, and spatial position through a convolutional neural network (CNN). We propose a multi-channel CNN for 3D object detection, which fuses three input channels including RGB, depth, and bird{\textquoteright}s eye view (BEV) images. We also propose a method to generate 3D proposals based on 2D ones in the RGB image and semantic prior. Training and test are conducted on the modified NYU V2 dataset and SUN RGB-D dataset in order to verify the effectiveness of the algorithm. We also carry out the actual experiments in a service robot to utilize the proposed 3D object detection method to enhance the environmental perception of the robot.",

keywords = "3D object detection, Environmental perception, Indoor robot, Multi-channel CNN",

author = "Li Wang and Ruifeng Li and Hezi Shi and Jingwen Sun and Lijun Zhao and Seah, {Hock Soon} and Quah, {Chee Kwang} and Budianto Tandianus",

note = "Publisher Copyright: {\textcopyright} 2019 by the author. Licensee MDPI, Basel, Switzerland.",

year = "2019",

month = feb,

day = "2",

doi = "10.3390/s19040893",

language = "English",

volume = "19",

journal = "Sensors",

issn = "1424-8220",

publisher = "Multidisciplinary Digital Publishing Institute (MDPI)",

number = "4",

}

TY - JOUR

T1 - Multi-channel convolutional neural network based 3D object detection for indoor robot environmental perception

AU - Wang, Li

AU - Li, Ruifeng

AU - Shi, Hezi

AU - Sun, Jingwen

AU - Zhao, Lijun

AU - Seah, Hock Soon

AU - Quah, Chee Kwang

AU - Tandianus, Budianto

PY - 2019/2/2

Y1 - 2019/2/2

N2 - Environmental perception is a vital feature for service robots when working in an indoor environment for a long time. The general 3D reconstruction is a low-level geometric information description that cannot convey semantics. In contrast, higher level perception similar to humans requires more abstract concepts, such as objects and scenes. Moreover, the 2D object detection based on images always fails to provide the actual position and size of an object, which is quite important for a robot’s operation. In this paper, we focus on the 3D object detection to regress the object’s category, 3D size, and spatial position through a convolutional neural network (CNN). We propose a multi-channel CNN for 3D object detection, which fuses three input channels including RGB, depth, and bird’s eye view (BEV) images. We also propose a method to generate 3D proposals based on 2D ones in the RGB image and semantic prior. Training and test are conducted on the modified NYU V2 dataset and SUN RGB-D dataset in order to verify the effectiveness of the algorithm. We also carry out the actual experiments in a service robot to utilize the proposed 3D object detection method to enhance the environmental perception of the robot.

AB - Environmental perception is a vital feature for service robots when working in an indoor environment for a long time. The general 3D reconstruction is a low-level geometric information description that cannot convey semantics. In contrast, higher level perception similar to humans requires more abstract concepts, such as objects and scenes. Moreover, the 2D object detection based on images always fails to provide the actual position and size of an object, which is quite important for a robot’s operation. In this paper, we focus on the 3D object detection to regress the object’s category, 3D size, and spatial position through a convolutional neural network (CNN). We propose a multi-channel CNN for 3D object detection, which fuses three input channels including RGB, depth, and bird’s eye view (BEV) images. We also propose a method to generate 3D proposals based on 2D ones in the RGB image and semantic prior. Training and test are conducted on the modified NYU V2 dataset and SUN RGB-D dataset in order to verify the effectiveness of the algorithm. We also carry out the actual experiments in a service robot to utilize the proposed 3D object detection method to enhance the environmental perception of the robot.

KW - 3D object detection

KW - Environmental perception

KW - Indoor robot

KW - Multi-channel CNN

UR - http://www.scopus.com/inward/record.url?scp=85062030705&partnerID=8YFLogxK

U2 - 10.3390/s19040893

DO - 10.3390/s19040893

M3 - Article

C2 - 30795507

AN - SCOPUS:85062030705

SN - 1424-8220

VL - 19

JO - Sensors

JF - Sensors

IS - 4

M1 - 893

ER -
