Sem-Aug: Improving Camera-LiDAR Feature Fusion With Semantic Augmentation for 3D Vehicle Detection

Lin Zhao; Meiling Wang; Yufeng Yue

doi:10.1109/LRA.2022.3191208

Lin Zhao, Meiling Wang, Yufeng Yue^*

^*Corresponding author for this work

School of Automation

Beijing Institute of Technology

Research output: Contribution to journal › Article › peer-review

15 Citations (Scopus)

Abstract

Camera-LiDAR fusion provides precise distance measurements and fine-grained textures, making it a promising option for 3D vehicle detection in autonomous driving scenarios. Previous camera-LiDAR based 3D vehicle detection approaches mainly focused on employing image-based pre-trained models to fetch semantic features. However, these methods may perform inferior to the LiDAR-based ones when lacking semantic segmentation labels in autonomous driving tasks. Motivated by this observation, we propose a novel semantic augmentation method, namely Sem-Aug, to guide high-confidence camera-LiDAR fusion feature generation and boost the performance of multimodal 3D vehicle detection. The key novelty of semantic augmentation lies in the 2D segmentation mask auto-labeling, which provides supervision for semantic segmentation sub-network to mitigate the poor generalization performance of camera-LiDAR fusion. Using semantic-augmentation-guided camera-LiDAR fusion features, Sem-Aug achieves remarkable performance on the representative autonomous driving KITTI dataset compared to both the LiDAR-based baseline and previous multimodal 3D vehicle detectors. Qualitative and quantitative experiments demonstrate that Sem-Aug provides significant improvements in challenging Hard detection scenarios caused by occlusion and truncation.

Original language	English
Pages (from-to)	9358-9365
Number of pages	8
Journal	IEEE Robotics and Automation Letters
Volume	7
Issue number	4
DOI	https://doi.org/10.1109/LRA.2022.3191208
Publication status	Published - 1 Oct 2022

Access to Document

10.1109/LRA.2022.3191208

Cite this

@article{f1fd3f0e160247e7a533c1f63a5bf5ee,

title = "Sem-Aug: Improving Camera-LiDAR Feature Fusion With Semantic Augmentation for 3D Vehicle Detection",

abstract = "Camera-LiDAR fusion provides precise distance measurements and fine-grained textures, making it a promising option for 3D vehicle detection in autonomous driving scenarios. Previous camera-LiDAR based 3D vehicle detection approaches mainly focused on employing image-based pre-trained models to fetch semantic features. However, these methods may perform inferior to the LiDAR-based ones when lacking semantic segmentation labels in autonomous driving tasks. Motivated by this observation, we propose a novel semantic augmentation method, namely Sem-Aug, to guide high-confidence camera-LiDAR fusion feature generation and boost the performance of multimodal 3D vehicle detection. The key novelty of semantic augmentation lies in the 2D segmentation mask auto-labeling, which provides supervision for semantic segmentation sub-network to mitigate the poor generalization performance of camera-LiDAR fusion. Using semantic-augmentation-guided camera-LiDAR fusion features, Sem-Aug achieves remarkable performance on the representative autonomous driving KITTI dataset compared to both the LiDAR-based baseline and previous multimodal 3D vehicle detectors. Qualitative and quantitative experiments demonstrate that Sem-Aug provides significant improvements in challenging Hard detection scenarios caused by occlusion and truncation.",

keywords = "Computer vision for transportation, intelligent transportation systems, object detection, segmentation and categorization",

author = "Lin Zhao and Meiling Wang and Yufeng Yue",

note = "Publisher Copyright: {\textcopyright} 2016 IEEE.",

year = "2022",

month = oct,

day = "1",

doi = "10.1109/LRA.2022.3191208",

language = "English",

volume = "7",

pages = "9358--9365",

journal = "IEEE Robotics and Automation Letters",

issn = "2377-3766",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

number = "4",

}

TY - JOUR

T1 - Sem-Aug

T2 - Improving Camera-LiDAR Feature Fusion With Semantic Augmentation for 3D Vehicle Detection

AU - Zhao, Lin

AU - Wang, Meiling

AU - Yue, Yufeng

PY - 2022/10/1

Y1 - 2022/10/1

N2 - Camera-LiDAR fusion provides precise distance measurements and fine-grained textures, making it a promising option for 3D vehicle detection in autonomous driving scenarios. Previous camera-LiDAR based 3D vehicle detection approaches mainly focused on employing image-based pre-trained models to fetch semantic features. However, these methods may perform inferior to the LiDAR-based ones when lacking semantic segmentation labels in autonomous driving tasks. Motivated by this observation, we propose a novel semantic augmentation method, namely Sem-Aug, to guide high-confidence camera-LiDAR fusion feature generation and boost the performance of multimodal 3D vehicle detection. The key novelty of semantic augmentation lies in the 2D segmentation mask auto-labeling, which provides supervision for semantic segmentation sub-network to mitigate the poor generalization performance of camera-LiDAR fusion. Using semantic-augmentation-guided camera-LiDAR fusion features, Sem-Aug achieves remarkable performance on the representative autonomous driving KITTI dataset compared to both the LiDAR-based baseline and previous multimodal 3D vehicle detectors. Qualitative and quantitative experiments demonstrate that Sem-Aug provides significant improvements in challenging Hard detection scenarios caused by occlusion and truncation.

AB - Camera-LiDAR fusion provides precise distance measurements and fine-grained textures, making it a promising option for 3D vehicle detection in autonomous driving scenarios. Previous camera-LiDAR based 3D vehicle detection approaches mainly focused on employing image-based pre-trained models to fetch semantic features. However, these methods may perform inferior to the LiDAR-based ones when lacking semantic segmentation labels in autonomous driving tasks. Motivated by this observation, we propose a novel semantic augmentation method, namely Sem-Aug, to guide high-confidence camera-LiDAR fusion feature generation and boost the performance of multimodal 3D vehicle detection. The key novelty of semantic augmentation lies in the 2D segmentation mask auto-labeling, which provides supervision for semantic segmentation sub-network to mitigate the poor generalization performance of camera-LiDAR fusion. Using semantic-augmentation-guided camera-LiDAR fusion features, Sem-Aug achieves remarkable performance on the representative autonomous driving KITTI dataset compared to both the LiDAR-based baseline and previous multimodal 3D vehicle detectors. Qualitative and quantitative experiments demonstrate that Sem-Aug provides significant improvements in challenging Hard detection scenarios caused by occlusion and truncation.

KW - Computer vision for transportation

KW - intelligent transportation systems

KW - object detection

KW - segmentation and categorization

UR - http://www.scopus.com/inward/record.url?scp=85135239655&partnerID=8YFLogxK

U2 - 10.1109/LRA.2022.3191208

DO - 10.1109/LRA.2022.3191208

M3 - Article

AN - SCOPUS:85135239655

SN - 2377-3766

VL - 7

SP - 9358

EP - 9365

JO - IEEE Robotics and Automation Letters

JF - IEEE Robotics and Automation Letters

IS - 4

ER -
