TY - JOUR
T1 - PillarID
T2 - Rethinking Backbone Network Designs for Pillar-Based 3D Object Detection in Infrastructure Point Cloud
AU - Zhang, Zhang
AU - Sun, Chao
AU - Wang, Bo
AU - Wen, Da
N1 - Publisher Copyright:
© 2000-2011 IEEE.
PY - 2026
Y1 - 2026
N2 - In recent years, vehicle-centric point cloud 3D object detection has been widely explored and effectively developed. However, due to differences in sensor placement, infrastructure-centric point cloud 3D object detection, an important component of the Intelligent Transportation System (ITS), has received neither sufficient attention nor effective network architecture design. Based on the difference in perspective of the infrastructure point cloud, we discover that, in the pillar representation, the roadside point cloud is denser and has higher coverage than the vehicle-side point cloud, which narrows the performance difference between dense pillar and sparse pillar backbone networks in roadside scenes. Inspired by this insight, we propose a network based on the dense backbone, dubbed PillarID. It uses a Single-stride Cross-stage Dense-backbone (SCD) to obtain efficient computation through channel degradation, split, and cross-stage connection, and benefits from the rich context of the roadside point cloud thanks to its single-stride design. Further, Hierarchical Receptive-field Expansion (HRE) is used to address the receptive field constraints of the single-stride backbone. Extensive experiments show that PillarID's architectural design is effective and achieves state-of-the-art performance on the popular large-scale roadside benchmarks DAIR-V2X-I and RCooper.
AB - In recent years, vehicle-centric point cloud 3D object detection has been widely explored and effectively developed. However, due to differences in sensor placement, infrastructure-centric point cloud 3D object detection, an important component of the Intelligent Transportation System (ITS), has received neither sufficient attention nor effective network architecture design. Based on the difference in perspective of the infrastructure point cloud, we discover that, in the pillar representation, the roadside point cloud is denser and has higher coverage than the vehicle-side point cloud, which narrows the performance difference between dense pillar and sparse pillar backbone networks in roadside scenes. Inspired by this insight, we propose a network based on the dense backbone, dubbed PillarID. It uses a Single-stride Cross-stage Dense-backbone (SCD) to obtain efficient computation through channel degradation, split, and cross-stage connection, and benefits from the rich context of the roadside point cloud thanks to its single-stride design. Further, Hierarchical Receptive-field Expansion (HRE) is used to address the receptive field constraints of the single-stride backbone. Extensive experiments show that PillarID's architectural design is effective and achieves state-of-the-art performance on the popular large-scale roadside benchmarks DAIR-V2X-I and RCooper.
KW - Point cloud
KW - infrastructure
KW - object detection
UR - https://www.scopus.com/pages/publications/105024116544
U2 - 10.1109/TITS.2025.3633156
DO - 10.1109/TITS.2025.3633156
M3 - Article
AN - SCOPUS:105024116544
SN - 1524-9050
VL - 27
SP - 232
EP - 240
JO - IEEE Transactions on Intelligent Transportation Systems
JF - IEEE Transactions on Intelligent Transportation Systems
IS - 1
ER -