Abstract
Local convolutional operations and global attention-based transformer operations extract point cloud features at two different scales, but few methods combine them effectively. In this paper, we propose CTpoint, a novel point cloud feature extractor that combines the advantages of convolutional and transformer operations. Specifically, CTpoint is composed of two branches, where "C" denotes the convolutional branch and "T" denotes the transformer branch. The convolutional branch extracts local features from grouped neighbor points, and the transformer branch applies attention to the entire point cloud to capture global features. To let the two branches communicate, we propose the Feature Transmission Element (FTE), which aligns the spatial-feature dimension and the semantic space between the local and global features, so that the extracted local and global features can be fused effectively and the two branches can learn expressive features in coordination. We use CTpoint to construct point cloud classification and segmentation networks and evaluate their performance on several public datasets. In addition, we visualize the features learned by CTpoint. The experimental results show that the expressive features learned by CTpoint enable the networks to achieve state-of-the-art performance on point cloud classification and segmentation tasks.
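The two-branch idea described in the abstract can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation. All function names (`knn_indices`, `local_branch`, `global_branch`, `fuse`) are hypothetical, and the learned layers are replaced by fixed stand-ins: the local branch max-pools absolute coordinate offsets over each point's k nearest neighbours, the global branch is single-head dot-product self-attention with identity projections, and the fusion step is plain concatenation. The FTE's alignment of dimensions and semantic space is not reproduced, since the abstract does not specify it.

```python
import math

def knn_indices(points, i, k):
    """Indices of the k nearest neighbours of point i, excluding i itself."""
    order = sorted(range(len(points)), key=lambda j: math.dist(points[i], points[j]))
    return [j for j in order if j != i][:k]

def local_branch(points, k=2):
    """Stand-in for the convolutional branch (assumption): for each point,
    group its k nearest neighbours and max-pool the absolute offsets."""
    feats = []
    for i, p in enumerate(points):
        offsets = [[abs(a - b) for a, b in zip(points[j], p)]
                   for j in knn_indices(points, i, k)]
        feats.append([max(col) for col in zip(*offsets)])
    return feats

def global_branch(points):
    """Stand-in for the transformer branch (assumption): single-head
    dot-product self-attention over all points, identity projections."""
    d = len(points[0])
    out = []
    for q in points:
        scores = [sum(a * b for a, b in zip(q, key)) / math.sqrt(d) for key in points]
        m = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - m) for s in scores]
        total = sum(weights)
        out.append([sum(w / total * v[c] for w, v in zip(weights, points))
                    for c in range(d)])
    return out

def fuse(local_feats, global_feats):
    """Placeholder for the FTE (assumption): concatenate per-point features."""
    return [l + g for l, g in zip(local_feats, global_feats)]

# Four toy points: the origin plus one unit step along each axis.
points = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
fused = fuse(local_branch(points), global_branch(points))
```

Each entry of `fused` holds a per-point feature whose first three values come from the local neighbourhood and whose last three come from attention over the whole cloud, which is the kind of combined representation the abstract attributes to CTpoint.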
Original language | English |
---|---|
Pages (from-to) | 273-289 |
Number of pages | 17 |
Journal | Neurocomputing |
Volume | 511 |
DOI | |
Publication status | Published - 28 Oct 2022 |