一种多层多模态融合 3D 目标检测方法

Translated title of the contribution: 3D Object Detection Based on Multilayer Multimodal Fusion

Zhi Guo Zhou, Wen Hao Ma

Research output: Contribution to journal › Article › peer-review

1 Citation (Scopus)

Abstract

Cameras and lidar are the key sources of information in autonomous vehicles (AVs). However, in current 3D object detection tasks, most pure point cloud networks still outperform networks that fuse images with lidar point clouds. Existing studies attribute this to the misaligned views of image and lidar data and to the difficulty of matching heterogeneous features; a single-stage fusion algorithm also struggles to fully fuse the features of the two modalities. To address this, a novel 3D object detection method based on multilayer multimodal fusion (3DMMF) is presented. First, in the early-fusion phase, point clouds are locally encoded by Frustum-RGB-PointPainting (FRP), which is formed from 2D detection boxes. The encoded point cloud is then fed into a PointPillars detection network whose channels are expanded with self-attention-based context awareness. In the late-fusion phase, the 2D and 3D candidate boxes are encoded as two sets of sparse tensors before non-maximum suppression, and the final 3D detection result is obtained with the camera-lidar object candidates fusion (CLOCs) network. Experiments on the KITTI dataset show that this fusion detection method significantly improves on the pure point cloud baseline, with an average mAP improvement of 6.24%.
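
The early-fusion step can be illustrated with a small sketch. The abstract does not specify the FRP encoding, so the version below is an assumption: points are taken to be already in the camera frame, a simple pinhole model with hypothetical intrinsics `fx, fy, cx, cy` is used, and each point that projects inside any 2D detection box (that is, inside that box's viewing frustum) has its normalized RGB value and an in-frustum flag appended. The function names are illustrative and do not come from the paper.

```python
import math
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, float, float]         # (x, y, z) in the camera frame, z forward
Box2D = Tuple[float, float, float, float]  # (u_min, v_min, u_max, v_max) in pixels
RGB = Tuple[int, int, int]                 # 8-bit colour


def project(point: Point, fx: float, fy: float,
            cx: float, cy: float) -> Optional[Tuple[float, float]]:
    """Pinhole projection; returns None for points at or behind the camera."""
    x, y, z = point
    if z <= 0.0:
        return None
    return fx * x / z + cx, fy * y / z + cy


def paint_points_in_frustums(points: Sequence[Point],
                             image: Sequence[Sequence[RGB]],
                             boxes: Sequence[Box2D],
                             fx: float, fy: float,
                             cx: float, cy: float) -> List[Tuple[float, ...]]:
    """Return (x, y, z, r, g, b, in_frustum) for every input point.

    Assumed encoding: points inside a 2D box's frustum get the colour of the
    pixel they project to (scaled to [0, 1]) and a flag of 1.0; all other
    points get zeros.
    """
    height, width = len(image), len(image[0])
    painted = []
    for p in points:
        feature = (0.0, 0.0, 0.0, 0.0)
        uv = project(p, fx, fy, cx, cy)
        if uv is not None:
            u, v = uv
            col, row = math.floor(u), math.floor(v)
            inside_image = 0 <= col < width and 0 <= row < height
            inside_box = any(
                box[0] <= u <= box[2] and box[1] <= v <= box[3] for box in boxes
            )
            if inside_image and inside_box:
                red, green, blue = image[row][col]
                feature = (red / 255.0, green / 255.0, blue / 255.0, 1.0)
        painted.append((*p, *feature))
    return painted
```

With real data, the lidar points would first be transformed into the camera frame using the dataset's calibration, and the painted seven-channel points would then be passed to the pillar encoder in place of the raw points.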

Translated title of the contribution: 3D Object Detection Based on Multilayer Multimodal Fusion
Original language: Chinese (Traditional)
Pages (from-to): 696-708
Number of pages: 13
Journal: Tien Tzu Hsueh Pao/Acta Electronica Sinica
Volume: 52
Issue number: 3
DOIs: https://doi.org/10.12263/DZXB.20220593
Publication status: Published - Mar 2024

Cite this

Zhou, Z. G., & Ma, W. H. (2024). 一种多层多模态融合 3D 目标检测方法. Tien Tzu Hsueh Pao/Acta Electronica Sinica, 52(3), 696-708. https://doi.org/10.12263/DZXB.20220593