TY - JOUR
T1 - SAT-GCN: Self-attention graph convolutional network-based 3D object detection for autonomous driving
AU - Wang, Li
AU - Song, Ziying
AU - Zhang, Xinyu
AU - Wang, Chenfei
AU - Zhang, Guoxin
AU - Zhu, Lei
AU - Li, Jun
AU - Liu, Huaping
N1 - Publisher Copyright:
© 2022 Elsevier B.V.
PY - 2023/1/10
Y1 - 2023/1/10
N2 - Accurate 3D object detection from point clouds is critical for autonomous vehicles. However, point cloud data collected by LiDAR sensors are inherently sparse, especially at long distances. In addition, most existing 3D object detectors extract local features and ignore interactions between features, producing weak semantic information that significantly limits detection performance. We propose a self-attention graph convolutional network (SAT-GCN), which utilizes a GCN and self-attention to enhance semantic representations by aggregating neighborhood information and focusing on vital relationships. SAT-GCN consists of three modules: vertex feature extraction (VFE), self-attention with dimension reduction (SADR), and far distance feature suppression (FDFS). VFE extracts neighboring relationships between features using a GCN after encoding a raw point cloud. SADR further augments the weights of crucial neighboring relationships through self-attention. FDFS suppresses meaningless edges formed by sparse point cloud distributions in remote areas and generates corresponding global features. Extensive experiments are conducted on the widely used KITTI and nuScenes 3D object detection benchmarks. The results demonstrate significant improvements to the mainstream methods PointPillars, SECOND, and PointRCNN, raising mean 3D AP by 4.88%, 5.02%, and 2.79%, respectively, on the KITTI test set. SAT-GCN boosts point cloud detection accuracy, especially at medium and long distances. Furthermore, adding the SAT-GCN module has limited impact on real-time performance and the number of model parameters.
AB - Accurate 3D object detection from point clouds is critical for autonomous vehicles. However, point cloud data collected by LiDAR sensors are inherently sparse, especially at long distances. In addition, most existing 3D object detectors extract local features and ignore interactions between features, producing weak semantic information that significantly limits detection performance. We propose a self-attention graph convolutional network (SAT-GCN), which utilizes a GCN and self-attention to enhance semantic representations by aggregating neighborhood information and focusing on vital relationships. SAT-GCN consists of three modules: vertex feature extraction (VFE), self-attention with dimension reduction (SADR), and far distance feature suppression (FDFS). VFE extracts neighboring relationships between features using a GCN after encoding a raw point cloud. SADR further augments the weights of crucial neighboring relationships through self-attention. FDFS suppresses meaningless edges formed by sparse point cloud distributions in remote areas and generates corresponding global features. Extensive experiments are conducted on the widely used KITTI and nuScenes 3D object detection benchmarks. The results demonstrate significant improvements to the mainstream methods PointPillars, SECOND, and PointRCNN, raising mean 3D AP by 4.88%, 5.02%, and 2.79%, respectively, on the KITTI test set. SAT-GCN boosts point cloud detection accuracy, especially at medium and long distances. Furthermore, adding the SAT-GCN module has limited impact on real-time performance and the number of model parameters.
KW - 3D object detection
KW - Graph convolutional network
KW - Self-attention mechanism
UR - http://www.scopus.com/inward/record.url?scp=85142323930&partnerID=8YFLogxK
U2 - 10.1016/j.knosys.2022.110080
DO - 10.1016/j.knosys.2022.110080
M3 - Article
AN - SCOPUS:85142323930
SN - 0950-7051
VL - 259
JO - Knowledge-Based Systems
JF - Knowledge-Based Systems
M1 - 110080
ER -