TY - JOUR
T1 - SAT-GCN: Self-attention graph convolutional network-based 3D object detection for autonomous driving
AU - Wang, Li
AU - Song, Ziying
AU - Zhang, Xinyu
AU - Wang, Chenfei
AU - Zhang, Guoxin
AU - Zhu, Lei
AU - Li, Jun
AU - Liu, Huaping
N1 - Publisher Copyright:
© 2022 Elsevier B.V.
PY - 2023/1/10
Y1 - 2023/1/10
N2 - Accurate 3D object detection from point clouds is critical for autonomous vehicles. However, point cloud data collected by LiDAR sensors are inherently sparse, especially at long distances. In addition, most existing 3D object detectors extract local features and ignore interactions between features, producing weak semantic information that significantly limits detection performance. We propose a self-attention graph convolutional network (SAT-GCN), which utilizes a GCN and self-attention to enhance semantic representations by aggregating neighborhood information and focusing on vital relationships. SAT-GCN consists of three modules: vertex feature extraction (VFE), self-attention with dimension reduction (SADR), and far distance feature suppression (FDFS). VFE extracts neighboring relationships between features using a GCN after encoding a raw point cloud. SADR further augments the weights of crucial neighboring relationships through self-attention. FDFS suppresses meaningless edges formed by sparse point cloud distributions in remote areas and generates corresponding global features. Extensive experiments are conducted on the widely used KITTI and nuScenes 3D object detection benchmarks. The results demonstrate significant improvements to the mainstream methods PointPillars, SECOND, and PointRCNN, raising mean 3D AP by 4.88%, 5.02%, and 2.79%, respectively, on the KITTI test set. SAT-GCN boosts point cloud detection accuracy, especially at medium and long distances. Furthermore, adding the SAT-GCN module has limited impact on real-time performance and the number of model parameters.
AB - Accurate 3D object detection from point clouds is critical for autonomous vehicles. However, point cloud data collected by LiDAR sensors are inherently sparse, especially at long distances. In addition, most existing 3D object detectors extract local features and ignore interactions between features, producing weak semantic information that significantly limits detection performance. We propose a self-attention graph convolutional network (SAT-GCN), which utilizes a GCN and self-attention to enhance semantic representations by aggregating neighborhood information and focusing on vital relationships. SAT-GCN consists of three modules: vertex feature extraction (VFE), self-attention with dimension reduction (SADR), and far distance feature suppression (FDFS). VFE extracts neighboring relationships between features using a GCN after encoding a raw point cloud. SADR further augments the weights of crucial neighboring relationships through self-attention. FDFS suppresses meaningless edges formed by sparse point cloud distributions in remote areas and generates corresponding global features. Extensive experiments are conducted on the widely used KITTI and nuScenes 3D object detection benchmarks. The results demonstrate significant improvements to the mainstream methods PointPillars, SECOND, and PointRCNN, raising mean 3D AP by 4.88%, 5.02%, and 2.79%, respectively, on the KITTI test set. SAT-GCN boosts point cloud detection accuracy, especially at medium and long distances. Furthermore, adding the SAT-GCN module has limited impact on real-time performance and the number of model parameters.
KW - 3D object detection
KW - Graph convolutional network
KW - Self-attention mechanism
UR - http://www.scopus.com/inward/record.url?scp=85142323930&partnerID=8YFLogxK
U2 - 10.1016/j.knosys.2022.110080
DO - 10.1016/j.knosys.2022.110080
M3 - Article
AN - SCOPUS:85142323930
SN - 0950-7051
VL - 259
JO - Knowledge-Based Systems
JF - Knowledge-Based Systems
M1 - 110080
ER -