Height3D: A Roadside Visual Framework Based on Height Prediction in Real 3-D Space

Zhang Zhang, Chao Sun*, Bo Wang, Bin Guo, Da Wen, Tianyi Zhu, Qili Ning

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

In recent years, vision-based roadside 3D object detection, an important part of the Intelligent Transportation System (ITS), has received a great deal of attention. It extends the perception range beyond the limitations of the Autonomous Vehicle (AV) and enhances road safety. Previous work, however, mainly focuses on height prediction in 2D image space, which is limited by the perspective effect in which near objects appear large and far objects appear small. This makes it difficult for the network to understand the real dimensions of targets in the 3D world. Motivated by this observation, Height3D, a roadside visual framework based on height prediction in real 3D space, is proposed. A Height Prediction Block (HPB) with explicit height supervision is proposed in real 3D space, instead of in 2D image space, to predict the height distribution of targets for the roadside view transform. In addition, a Spatial Aware Block (SAB) is used to further extract spatial context information in bird's-eye-view (BEV) space and enhance fine-grained BEV features. The proposed method is applied to two large-scale roadside benchmarks, DAIR-V2X-I and Rope3D, and extensive experiments are performed to verify its effectiveness. The proposed Height3D outperforms state-of-the-art methods by (1.15, 7.37, 4.03) Average Precision (AP) for the Vehicle, Pedestrian, and Cyclist categories, respectively, in the 3D object detection task. Meanwhile, the proposed method achieves 31.55 FPS without using any CUDA or TensorRT acceleration.
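The abstract does not give the internals of the HPB or of the view transform, so the following Python sketch only illustrates the general geometric idea behind height-based lifting: once a height above the ground is predicted for a pixel, the camera ray through that pixel can be intersected with the horizontal plane at that height to obtain a 3D point. Everything here is an assumption made for illustration and is not taken from the paper: the function names `expected_height` and `pixel_to_world`, the discrete height bins, the softmax over per-pixel logits, and the toy camera parameters.

```python
import numpy as np


def expected_height(logits, bin_centers):
    """Turn per-bin logits into an expected height (hypothetical formulation)."""
    logits = np.asarray(logits, dtype=float)
    probs = np.exp(logits - logits.max())  # numerically stable softmax
    probs /= probs.sum()
    return float(probs @ np.asarray(bin_centers, dtype=float))


def pixel_to_world(u, v, height, K, R_cam_to_world, cam_center):
    """Intersect the camera ray through pixel (u, v) with the plane z = height.

    K is the 3x3 pinhole intrinsic matrix, R_cam_to_world rotates camera
    coordinates into a ground-aligned world frame (z up), and cam_center is
    the camera position in that world frame.
    """
    ray_cam = np.linalg.inv(K) @ np.array([u, v, 1.0])
    ray_world = R_cam_to_world @ ray_cam
    if abs(ray_world[2]) < 1e-9:
        raise ValueError("ray is parallel to the ground plane")
    scale = (height - cam_center[2]) / ray_world[2]
    if scale <= 0:
        raise ValueError("plane at this height lies behind the camera")
    return cam_center + scale * ray_world


if __name__ == "__main__":
    # Toy setup (assumed values): camera 6 m above ground, looking straight down.
    K = np.array([[1000.0, 0.0, 960.0],
                  [0.0, 1000.0, 540.0],
                  [0.0, 0.0, 1.0]])
    R = np.diag([1.0, -1.0, -1.0])       # camera z axis points to world -z
    C = np.array([0.0, 0.0, 6.0])
    bins = np.linspace(0.0, 3.0, 7)      # height bin centers in meters
    h = expected_height(np.zeros(7), bins)  # uniform distribution over bins
    print(pixel_to_world(1960.0, 540.0, h, K, R, C))
```

With the uniform logits above, the expected height is 1.5 m and the printed point is approximately (4.5, 0.0, 1.5) in the assumed world frame. Because the intersection uses metric height rather than image-space size, the same predicted height maps to a consistent 3D location regardless of how far the target is from the camera.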

Original language: English
Journal: IEEE Transactions on Intelligent Transportation Systems
Publication status: Accepted/In press - 2025

Keywords

  • height prediction
  • roadside perception
  • vision
