A Multi-Level Semantic Fusion VoteNet for 3D Object Detection on Point Clouds

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review


Abstract

In this paper, a Multi-Level Semantic Fusion VoteNet (MLSFVNet) is proposed for detecting objects in 3D scenes. The method operates on 3D point clouds captured by an RGB-D camera, which provide abundant and precise distance information about the environment. The proposed method consists of three modules: a multi-level semantics fusion network, a voting operation, and a proposal generator. To compensate for the lack of semantic information, the multi-level semantics fusion network captures features at multiple levels. To predict object centers, the voting operation maps these features into a feature space of the same scale and regresses the object centers. The proposal generator then generates proposals and predicts the bounding boxes. MLSFVNet is evaluated on the popular indoor datasets SUN RGB-D and ScanNetV2. The experimental results demonstrate that MLSFVNet is an effective way to improve detection accuracy, achieving 58.1% mAP on SUN RGB-D and 59.8% mAP on ScanNetV2.
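
The fusion and voting steps described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names `fuse_levels` and `vote`, the nearest-neighbour upsampling, and the single linear layer standing in for the learned voting network are all assumptions made for illustration only.

```python
import numpy as np

def fuse_levels(seed_xyz, levels):
    """Hypothetical multi-level fusion: bring every level onto the seed points.

    seed_xyz: [N, 3] seed point coordinates.
    levels:   list of (xyz [M_k, 3], feat [M_k, C_k]) pairs, one per level.
    Returns an [N, sum(C_k)] array of concatenated per-seed features.
    """
    fused = []
    for xyz, feat in levels:
        # Squared distance from every seed to every point of this level: [N, M_k]
        d2 = ((seed_xyz[:, None, :] - xyz[None, :, :]) ** 2).sum(axis=-1)
        # Nearest-neighbour upsampling (an assumption, chosen for simplicity)
        nearest = d2.argmin(axis=1)
        fused.append(feat[nearest])
    return np.concatenate(fused, axis=1)

def vote(seed_xyz, fused_feat, weight, bias):
    """Hypothetical voting step: regress an offset per seed, then shift the seed.

    A single linear layer (weight [C, 3], bias [3]) stands in for the learned
    network; each vote is the seed position plus its predicted offset.
    """
    offsets = fused_feat @ weight + bias
    return seed_xyz + offsets

# Toy usage with random data: two feature levels, four seed points.
rng = np.random.default_rng(0)
seed_xyz = rng.normal(size=(4, 3))
levels = [
    (seed_xyz, rng.normal(size=(4, 2))),      # fine level, 2 channels
    (seed_xyz[:2], rng.normal(size=(2, 3))),  # coarse level, 3 channels
]
fused = fuse_levels(seed_xyz, levels)
# Zero weights and a constant bias make every seed vote one unit along x.
votes = vote(seed_xyz, fused, np.zeros((5, 3)), np.array([1.0, 0.0, 0.0]))
```

In a full detector, the votes would then be clustered and passed to a proposal stage that predicts bounding boxes; that stage is omitted here because the abstract gives no detail on it.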

Original language: English
Title of host publication: Proceeding - 2021 China Automation Congress, CAC 2021
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 4514-4519
Number of pages: 6
ISBN (Electronic): 9781665426473
Publication status: Published - 2021
Event: 2021 China Automation Congress, CAC 2021 - Beijing, China
Duration: 22 Oct 2021 to 24 Oct 2021

Publication series

Name: Proceeding - 2021 China Automation Congress, CAC 2021

Conference

Conference: 2021 China Automation Congress, CAC 2021
Country/Territory: China
City: Beijing
Period: 22/10/21 to 24/10/21

Keywords

  • 3D object detection
  • computer vision
  • deep learning
  • multi-level semantics
  • point clouds
