TY - JOUR
T1 - Height3D
T2 - A Roadside Visual Framework Based on Height Prediction in Real 3-D Space
AU - Zhang, Zhang
AU - Sun, Chao
AU - Wang, Bo
AU - Guo, Bin
AU - Wen, Da
AU - Zhu, Tianyi
AU - Ning, Qili
N1 - Publisher Copyright:
© 2000-2011 IEEE.
PY - 2025
Y1 - 2025
N2 - In recent years, vision-based roadside 3D object detection has received a great deal of attention as an important part of the Intelligent Transportation System (ITS). It extends the perception range beyond the limitations of the Autonomous Vehicle (AV) and enhances road safety. However, previous work mainly focuses on height prediction in 2D image space, which is limited by the perspective property that near objects appear large and far objects appear small in images, making it difficult for the network to understand the real dimensions of targets in the 3D world. Inspired by this insight, Height3D, a roadside visual framework based on height prediction in real 3D space, is proposed. A Height Prediction Block (HPB) with explicit height supervision is proposed in real 3D space instead of in 2D image space to predict the height distribution of targets for the roadside view transform. In addition, a Spatial Aware Block (SAB) is used to further extract spatial context information in BEV space and enhance fine-grained BEV features. The proposed method is applied to two large-scale roadside benchmarks, DAIR-V2X-I and Rope3D, and extensive experiments are performed to verify its effectiveness. The proposed Height3D outperforms the state-of-the-art methods by (1.15, 7.37, 4.03) Average Precision (AP) for the Vehicle, Pedestrian and Cyclist categories in the 3D object detection task, respectively. Meanwhile, the proposed method achieves 31.55 FPS without using any CUDA or TensorRT acceleration.
AB - In recent years, vision-based roadside 3D object detection has received a great deal of attention as an important part of the Intelligent Transportation System (ITS). It extends the perception range beyond the limitations of the Autonomous Vehicle (AV) and enhances road safety. However, previous work mainly focuses on height prediction in 2D image space, which is limited by the perspective property that near objects appear large and far objects appear small in images, making it difficult for the network to understand the real dimensions of targets in the 3D world. Inspired by this insight, Height3D, a roadside visual framework based on height prediction in real 3D space, is proposed. A Height Prediction Block (HPB) with explicit height supervision is proposed in real 3D space instead of in 2D image space to predict the height distribution of targets for the roadside view transform. In addition, a Spatial Aware Block (SAB) is used to further extract spatial context information in BEV space and enhance fine-grained BEV features. The proposed method is applied to two large-scale roadside benchmarks, DAIR-V2X-I and Rope3D, and extensive experiments are performed to verify its effectiveness. The proposed Height3D outperforms the state-of-the-art methods by (1.15, 7.37, 4.03) Average Precision (AP) for the Vehicle, Pedestrian and Cyclist categories in the 3D object detection task, respectively. Meanwhile, the proposed method achieves 31.55 FPS without using any CUDA or TensorRT acceleration.
KW - height prediction
KW - roadside perception
KW - Vision
UR - http://www.scopus.com/inward/record.url?scp=105005281803&partnerID=8YFLogxK
U2 - 10.1109/TITS.2025.3551553
DO - 10.1109/TITS.2025.3551553
M3 - Article
AN - SCOPUS:105005281803
SN - 1524-9050
JO - IEEE Transactions on Intelligent Transportation Systems
JF - IEEE Transactions on Intelligent Transportation Systems
ER -