MonoGAE: Roadside Monocular 3D Object Detection With Ground-Aware Embeddings

  • Lei Yang
  • Xinyu Zhang*
  • Jiaxin Yu
  • Jun Li
  • Tong Zhao
  • Li Wang
  • Yi Huang
  • Chuang Zhang
  • Hong Wang
  • Yiming Li
  • *Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

24 Citations (Scopus)

Abstract

Although the majority of recent autonomous driving systems concentrate on developing perception methods based on ego-vehicle sensors, there is an overlooked alternative approach that involves leveraging intelligent roadside cameras to help extend the ego-vehicle perception ability beyond the visual range. We discover that most existing monocular 3D object detectors rely on the ego-vehicle prior assumption that the optical axis of the camera is parallel to the ground. However, the roadside camera is installed on a pole with a pitched angle, which makes the existing methods not optimal for roadside scenes. In this paper, we introduce a novel framework for Roadside Monocular 3D object detection with ground-aware embeddings, named MonoGAE. Specifically, the ground plane is a stable and strong prior knowledge due to the fixed installation of cameras in roadside scenarios. In order to reduce the domain gap between the ground geometry information and high-dimensional image features, we employ a supervised training paradigm with a ground plane to predict high-dimensional ground-aware embeddings. These embeddings are subsequently integrated with image features through cross-attention mechanisms. Furthermore, to improve the detector's robustness to the divergences in cameras' installation poses, we replace the ground plane depth map with a novel pixel-level refined ground plane equation map. Our approach demonstrates a substantial performance advantage over all previous monocular 3D object detectors on widely recognized 3D detection benchmarks for roadside cameras. The code and pre-trained models will be released soon.
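The geometric point in the abstract, that a pitched roadside camera sees the ground differently from a forward-looking vehicle camera, can be illustrated with basic pinhole geometry. The following is a minimal sketch, not the MonoGAE implementation: it computes the depth at which each pixel's viewing ray meets a flat ground plane, assuming a camera with no roll, a known downward pitch, and a known mounting height. The function name `ground_depth_map` and all parameter names are hypothetical, chosen only for this illustration.

```python
import math


def ground_depth_map(fy, cy, pitch_rad, cam_height, img_w, img_h):
    """Per-pixel depth of a flat ground plane seen by a pitched pinhole camera.

    Assumptions (illustrative only): camera axes are x right, y down,
    z forward; the camera has no roll; it is pitched downward by
    `pitch_rad` and mounted `cam_height` metres above a flat ground.
    """
    # Downward-pointing ground normal expressed in the camera frame.
    n_y, n_z = math.cos(pitch_rad), math.sin(pitch_rad)
    depth = []
    for v in range(img_h):
        # Viewing ray through row v, scaled so its z component is 1.
        ray_y = (v - cy) / fy
        # Ground points satisfy n . (z * ray) = cam_height.
        denom = n_y * ray_y + n_z
        # Rows at or above the horizon never hit the ground.
        z = cam_height / denom if denom > 1e-9 else math.nan
        # With zero roll the depth is constant along each image row.
        depth.append([z] * img_w)
    return depth
```

With zero pitch, only rows below the principal point receive a finite depth, which is the ego-vehicle assumption the abstract describes. With a non-zero pitch, the horizon moves up in the image and the depth profile changes, which is why a detector tuned for the former is not optimal for roadside views.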

Original language: English
Pages (from-to): 17587-17601
Number of pages: 15
Journal: IEEE Transactions on Intelligent Transportation Systems
Volume: 25
Issue number: 11
Publication status: Published - 2024
Externally published: Yes

Keywords

  • autonomous driving
  • Monocular 3D object detection
  • roadside perception
