Voxel Transformer with Shifted Windows for 3D Object Detection

Chencheng Luo, Xiangzhou Wang, Ziling Zhao, Shuhua Zheng*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

Abstract

Recent three-dimensional object detection methods are typically classified as point-based or voxel-based, depending on how they process raw point clouds. Voxel-based methods, which convert point clouds to voxels to reduce computational load, often suffer from geometric information loss and limited detection accuracy. In this paper, we propose a novel single-stage, voxel-based 3D object detection algorithm (VWTr) that uses a Voxel Feature Encoder to extract features and a Transformer Backbone with shifted windows to strengthen feature extraction, achieving a balance between accuracy and speed. The Transformer Backbone with shifted windows helps the network attend efficiently to global information and compensates for the geometric information loss arising from the voxelization step of the voxel feature encoder. To this end, we design a feature aggregation operation to enhance the network's representation capability. Experiments on KITTI demonstrate that our method reaches, respectively, 84.11%, 75.18%, 69.53%
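The two ideas the abstract relies on, voxelization and shifted windows, can be illustrated with a minimal sketch. This is not the authors' VWTr implementation: the function names (`voxelize`, `window_id`, `group_by_window`), the voxel size, the window size, and the shift amount are all assumptions chosen for illustration, and the shift is modeled as a simple offset of the window grid rather than any specific scheme from the paper.

```python
from collections import defaultdict


def voxelize(points, voxel_size):
    """Group 3D points by the integer index of the voxel that contains them."""
    voxels = defaultdict(list)
    for x, y, z in points:
        key = (int(x // voxel_size), int(y // voxel_size), int(z // voxel_size))
        voxels[key].append((x, y, z))
    return dict(voxels)


def window_id(voxel_index, window_size, shift=0):
    """Return the window a voxel falls in; `shift` offsets the window grid."""
    return tuple((i + shift) // window_size for i in voxel_index)


def group_by_window(voxel_indices, window_size, shift=0):
    """Collect voxel indices per window (attention would run inside each group)."""
    windows = defaultdict(list)
    for idx in voxel_indices:
        windows[window_id(idx, window_size, shift)].append(idx)
    return dict(windows)


# Hypothetical toy point cloud: the first two points share one voxel.
points = [(0.1, 0.2, 0.0), (0.3, 0.1, 0.1), (1.9, 0.2, 0.0), (2.1, 0.3, 0.0)]
voxels = voxelize(points, voxel_size=0.5)

# Regular windows separate voxels (3, 0, 0) and (4, 0, 0); shifted windows
# place them in the same group, so information can cross the old boundary.
regular = group_by_window(voxels.keys(), window_size=4)
shifted = group_by_window(voxels.keys(), window_size=4, shift=2)
```

In this toy case the voxels at indices (3, 0, 0) and (4, 0, 0) are adjacent but fall on opposite sides of a regular window boundary. Offsetting the window grid by half a window groups them together, which is the general motivation for alternating regular and shifted windows.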

Original language: English
Title of host publication: Proceedings - 2023 China Automation Congress, CAC 2023
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 2717-2721
Number of pages: 5
ISBN (Electronic): 9798350303759
DOI
Publication status: Published - 2023
Event: 2023 China Automation Congress, CAC 2023 - Chongqing, China
Duration: 17 Nov 2023 to 19 Nov 2023

Publication series

Name: Proceedings - 2023 China Automation Congress, CAC 2023

Conference

Conference: 2023 China Automation Congress, CAC 2023
Country/Territory: China
City: Chongqing
Period: 17/11/23 to 19/11/23
