PolarGFusion3D: Polar Graph Fusion Network for Enhanced Multimodal 3D Perception in Intelligent Vehicles

Luxing Li, Chao Wei

Research output: Contribution to journal › Article › peer-review

1 Citation (Scopus)

Abstract

Multimodal fusion technology significantly enhances the safety and perception capabilities of intelligent vehicles. Recently, replacing Cartesian coordinate system voxels with polar voxels in 3D perception tasks has significantly improved spatial occupancy rates and adaptability. However, the uneven distribution of voxels introduces new challenges: feature information distortion and reduced real-time performance. This paper proposes a multimodal fusion network based on polar graphs to address these issues. Raw data from LiDAR, cameras, and millimeter-wave (MMW) radar are initially preprocessed, and point-graph and voxel-graph structures in polar coordinates are constructed. Subsequently, using Graph Attention Networks (GAT), features are extracted and aggregated at multiple levels, forming a polar-based Bird's Eye View (BEV) feature map. At the BEV level, multimodal features are fused, and multi-scale features are aggregated using multi-scale GAT, culminating in the design of a polar-based CenterHead to complete the 3D perception task. Extensive experiments conducted on the nuScenes dataset and real vehicle test data demonstrate that the detection precision (70.5% mAP) and inference speed (12.6 Hz) of the model surpass those of comparative models, establishing a new state-of-the-art (SOTA). Additionally, the model exhibits high levels of perception accuracy, robustness, and generalizability across various real vehicle scenarios.
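
To illustrate the polar voxel idea mentioned in the abstract, the following is a minimal sketch of mapping Cartesian LiDAR points to polar (radius, azimuth, height) voxel bins. It is not the authors' implementation: the function names, the uniform bin layout, and all grid parameters (`r_max`, `z_min`, `z_max`, `n_r`, `n_theta`, `n_z`) are assumptions chosen only for illustration.

```python
import math


def polar_voxel_index(x, y, z, r_max=50.0, z_min=-5.0, z_max=3.0,
                      n_r=4, n_theta=8, n_z=2):
    """Return (radial, azimuth, height) bin indices, or None if out of range.

    All grid parameters are hypothetical defaults, not values from the paper.
    """
    r = math.hypot(x, y)
    if r >= r_max or not (z_min <= z < z_max):
        return None
    # Wrap the azimuth into [0, 2*pi) so bin 0 starts at the +x axis.
    theta = math.atan2(y, x) % (2.0 * math.pi)
    i_r = int(r / r_max * n_r)
    # Clamp to guard against floating-point rounding at the 2*pi boundary.
    i_t = min(int(theta / (2.0 * math.pi) * n_theta), n_theta - 1)
    i_z = int((z - z_min) / (z_max - z_min) * n_z)
    return i_r, i_t, i_z


def voxelize(points, **grid):
    """Group (x, y, z) points by polar voxel index, dropping out-of-range points."""
    voxels = {}
    for p in points:
        idx = polar_voxel_index(*p, **grid)
        if idx is not None:
            voxels.setdefault(idx, []).append(p)
    return voxels
```

With uniform radial and angular bins, a voxel's footprint grows with distance from the sensor, which is one way to picture the uneven voxel distribution the abstract refers to.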

Original language: English
Pages (from-to): 1-12
Number of pages: 12
Journal: IEEE Transactions on Intelligent Vehicles
Publication status: Accepted/In press - 2024
Externally published: Yes

Keywords

  • 3D Perception
  • Cameras
  • Feature extraction
  • Graph Attention Networks
  • Intelligent Vehicles
  • Laser radar
  • Multimodal Fusion
  • Point cloud compression
  • Radar
  • Real-time systems
  • Three-dimensional displays
