Abstract
This paper proposes a novel multimodal collaborative perception framework to enhance the situational awareness of autonomous vehicles. First, a multimodal fusion baseline system is built that effectively integrates Light Detection and Ranging (LiDAR) point clouds and camera images, providing a comparative benchmark for subsequent research. Second, several well-known feature fusion strategies are investigated in the context of collaborative scenarios, including channel-wise concatenation, element-wise summation, and transformer-based methods. This study aims to seamlessly integrate intermediate representations from different sensor modalities and to provide a thorough assessment of their effects on model performance. Extensive experiments are conducted on OPV2V, a large-scale open-source simulation dataset. The results show that attention-based multimodal fusion outperforms the alternative strategies, delivering more precise target localization in complex traffic scenarios and thereby enhancing the safety and reliability of autonomous driving systems.
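As a rough illustration of the three intermediate-feature fusion strategies named in the abstract (channel-wise concatenation, element-wise summation, and transformer-style attention), the sketch below fuses LiDAR and camera feature maps in PyTorch. All module names, tensor shapes, and hyperparameters are illustrative assumptions and do not reflect the authors' actual implementation or the OPV2V pipeline.

```python
# Hypothetical sketch of three intermediate-feature fusion strategies for
# LiDAR and camera branches. Shapes and hyperparameters are assumptions.
import torch
import torch.nn as nn


class ConcatFusion(nn.Module):
    """Channel-wise concatenation followed by a 1x1 conv to restore channels."""
    def __init__(self, channels: int):
        super().__init__()
        self.proj = nn.Conv2d(2 * channels, channels, kernel_size=1)

    def forward(self, lidar_feat, camera_feat):
        return self.proj(torch.cat([lidar_feat, camera_feat], dim=1))


class SumFusion(nn.Module):
    """Element-wise summation of the two modality features."""
    def forward(self, lidar_feat, camera_feat):
        return lidar_feat + camera_feat


class AttentionFusion(nn.Module):
    """Cross-attention fusion: LiDAR features attend to camera features."""
    def __init__(self, channels: int, num_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(channels, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(channels)

    def forward(self, lidar_feat, camera_feat):
        b, c, h, w = lidar_feat.shape
        q = lidar_feat.flatten(2).transpose(1, 2)    # (B, H*W, C) queries
        kv = camera_feat.flatten(2).transpose(1, 2)  # (B, H*W, C) keys/values
        fused, _ = self.attn(q, kv, kv)
        fused = self.norm(fused + q)                 # residual connection
        return fused.transpose(1, 2).reshape(b, c, h, w)


if __name__ == "__main__":
    # Toy feature maps standing in for intermediate LiDAR/camera representations.
    lidar = torch.randn(2, 64, 32, 32)
    camera = torch.randn(2, 64, 32, 32)
    for fusion in (ConcatFusion(64), SumFusion(), AttentionFusion(64)):
        out = fusion(lidar, camera)
        print(type(fusion).__name__, tuple(out.shape))
```

In this reading, concatenation and summation combine the modalities with fixed rules, while the attention variant learns spatially varying weights over the camera features, which is one plausible reason the abstract reports better localization for the attention-based fusion.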
| Translated title of the contribution | Collaborative Perception Method Based on Multisensor Fusion |
|---|---|
| Original language | Traditional Chinese |
| Pages (from-to) | 87-96 |
| Number of pages | 10 |
| Journal | Journal of Radars |
| Volume | 13 |
| Issue | 1 |
| DOI | |
| Publication status | Published - 2024 |
Keywords
- 3D object detection
- Autonomous driving
- Collaborative perception
- Intelligent transportation systems
- Multimodal fusion