TY - JOUR
T1 - Sem-Aug
T2 - Improving Camera-LiDAR Feature Fusion With Semantic Augmentation for 3D Vehicle Detection
AU - Zhao, Lin
AU - Wang, Meiling
AU - Yue, Yufeng
N1 - Publisher Copyright: © 2016 IEEE.
PY - 2022/10/1
Y1 - 2022/10/1
AB - Camera-LiDAR fusion provides precise distance measurements and fine-grained textures, making it a promising option for 3D vehicle detection in autonomous driving scenarios. Previous camera-LiDAR-based 3D vehicle detection approaches have mainly focused on using image-based pre-trained models to extract semantic features. However, these methods may underperform LiDAR-based ones when semantic segmentation labels are lacking in autonomous driving tasks. Motivated by this observation, we propose a novel semantic augmentation method, namely Sem-Aug, to guide high-confidence camera-LiDAR fusion feature generation and boost the performance of multimodal 3D vehicle detection. The key novelty of semantic augmentation lies in 2D segmentation mask auto-labeling, which provides supervision for the semantic segmentation sub-network to mitigate the poor generalization performance of camera-LiDAR fusion. Using semantic-augmentation-guided camera-LiDAR fusion features, Sem-Aug achieves remarkable performance on the representative autonomous driving KITTI dataset compared to both the LiDAR-based baseline and previous multimodal 3D vehicle detectors. Qualitative and quantitative experiments demonstrate that Sem-Aug provides significant improvements in challenging Hard detection scenarios caused by occlusion and truncation.
KW - Computer vision for transportation
KW - intelligent transportation systems
KW - object detection
KW - segmentation and categorization
UR - http://www.scopus.com/inward/record.url?scp=85135239655&partnerID=8YFLogxK
U2 - 10.1109/LRA.2022.3191208
DO - 10.1109/LRA.2022.3191208
M3 - Article
AN - SCOPUS:85135239655
SN - 2377-3766
VL - 7
SP - 9358
EP - 9365
JO - IEEE Robotics and Automation Letters
JF - IEEE Robotics and Automation Letters
IS - 4
ER -