TY - GEN
T1 - Distance Awared: Adaptive Voxel Resolution to help 3D Object Detection Networks See Farther
AU - Liao, Zhiyu
AU - Jin, Ying
AU - Ma, Hongbin
AU - Alsumeri, Abdulrahman
N1 - Publisher Copyright:
© 2023 Technical Committee on Control Theory, Chinese Association of Automation.
PY - 2023
Y1 - 2023
N2 - Point clouds are the most widely used data input for modern 3D object detection methods; however, due to the complexity of the environments in which the data are collected, they suffer from inevitable information loss, which is especially severe for distant objects. In this paper, we improve the backbone of voxel-based 3D object detection methods to better detect distant targets. The proposed improvements process raw points at different resolutions according to their distance from the lidar. Specifically, the farther a point is from the lidar, the finer the feature extraction and aggregation we adopt. Our goal is to balance information loss against memory consumption. Moreover, inspired by the success of Transformers in computer vision, we add a Multi-headed Self-attention (MHSA) structure to our modified backbone. MHSA provides a global receptive field, which helps obtain a more informative Bird's Eye View (BEV) representation of the point cloud. Our modifications are plug-and-play and can be used in any 3D object detection method based on voxels and sparse 3D convolution. We evaluated our modifications on KITTI; experiments demonstrate the effectiveness of our approach.
AB - Point clouds are the most widely used data input for modern 3D object detection methods; however, due to the complexity of the environments in which the data are collected, they suffer from inevitable information loss, which is especially severe for distant objects. In this paper, we improve the backbone of voxel-based 3D object detection methods to better detect distant targets. The proposed improvements process raw points at different resolutions according to their distance from the lidar. Specifically, the farther a point is from the lidar, the finer the feature extraction and aggregation we adopt. Our goal is to balance information loss against memory consumption. Moreover, inspired by the success of Transformers in computer vision, we add a Multi-headed Self-attention (MHSA) structure to our modified backbone. MHSA provides a global receptive field, which helps obtain a more informative Bird's Eye View (BEV) representation of the point cloud. Our modifications are plug-and-play and can be used in any 3D object detection method based on voxels and sparse 3D convolution. We evaluated our modifications on KITTI; experiments demonstrate the effectiveness of our approach.
KW - 3D object detection
KW - Multi-headed Self-attention
KW - distant targets
KW - plug and play
KW - voxel-based
UR - http://www.scopus.com/inward/record.url?scp=85175547668&partnerID=8YFLogxK
U2 - 10.23919/CCC58697.2023.10240474
DO - 10.23919/CCC58697.2023.10240474
M3 - Conference contribution
AN - SCOPUS:85175547668
T3 - Chinese Control Conference, CCC
SP - 7995
EP - 8000
BT - 2023 42nd Chinese Control Conference, CCC 2023
PB - IEEE Computer Society
T2 - 42nd Chinese Control Conference, CCC 2023
Y2 - 24 July 2023 through 26 July 2023
ER -