TY - GEN
T1 - 3D Occupancy Perception Network Based on Temporal Fusion of Bird's-Eye-View Features
AU - Wu, Shaobin
AU - Li, Yixuan
AU - Chu, Yunfeng
AU - Lin, Xuze
AU - Tan, Sheng
AU - Li, Xiaoan
N1 - Publisher Copyright: © 2025 IEEE.
PY - 2025
Y1 - 2025
N2 - To address the difficulty of perceiving long-tailed obstacles and the high complexity of modeling dynamic environments in unmanned driving scenarios, this paper proposes a 3D occupancy perception network based on temporal fusion of bird's-eye-view (BEV) features. First, image features are extracted by an image backbone network and mapped into BEV features, which are then temporally fused by a deformable attention mechanism. Second, a dual-branch prediction structure for 3D semantic occupancy and the 2D velocity flow field is designed to decouple these heterogeneous tasks. The structure performs fine-grained voxel-level semantic occupancy prediction through 3D convolution and generates the 2D velocity flow field with a temporal cost volume matching mechanism, reducing multi-task competition while maintaining real-time performance. Finally, a dynamic supervision strategy is proposed. It uses ray-extended sampling to generate key-voxel supervision masks covering obstacles and the surrounding empty voxels, and it combines random sampling of empty voxels with differentiated supervision of dynamic and static voxels to alleviate class imbalance and suppress the trailing effect in predictions. Experiments on the FlowOcc3D dataset and on a real vehicle show that the proposed network performs well in both semantic occupancy and velocity prediction, verifying its effectiveness across driving scenarios. Its lightweight design supports real-time environment perception and path planning for unmanned systems and promotes the application of 3D occupancy perception technology in various scenarios.
AB - To address the difficulty of perceiving long-tailed obstacles and the high complexity of modeling dynamic environments in unmanned driving scenarios, this paper proposes a 3D occupancy perception network based on temporal fusion of bird's-eye-view (BEV) features. First, image features are extracted by an image backbone network and mapped into BEV features, which are then temporally fused by a deformable attention mechanism. Second, a dual-branch prediction structure for 3D semantic occupancy and the 2D velocity flow field is designed to decouple these heterogeneous tasks. The structure performs fine-grained voxel-level semantic occupancy prediction through 3D convolution and generates the 2D velocity flow field with a temporal cost volume matching mechanism, reducing multi-task competition while maintaining real-time performance. Finally, a dynamic supervision strategy is proposed. It uses ray-extended sampling to generate key-voxel supervision masks covering obstacles and the surrounding empty voxels, and it combines random sampling of empty voxels with differentiated supervision of dynamic and static voxels to alleviate class imbalance and suppress the trailing effect in predictions. Experiments on the FlowOcc3D dataset and on a real vehicle show that the proposed network performs well in both semantic occupancy and velocity prediction, verifying its effectiveness across driving scenarios. Its lightweight design supports real-time environment perception and path planning for unmanned systems and promotes the application of 3D occupancy perception technology in various scenarios.
KW - 3D occupancy
KW - environment perception
KW - temporal fusion
KW - unmanned driving
UR - https://www.scopus.com/pages/publications/105031891669
U2 - 10.1109/ICUS66297.2025.11294187
DO - 10.1109/ICUS66297.2025.11294187
M3 - Conference contribution
AN - SCOPUS:105031891669
T3 - Proceedings of 2025 IEEE International Conference on Unmanned Systems, ICUS 2025
SP - 686
EP - 694
BT - Proceedings of 2025 IEEE International Conference on Unmanned Systems, ICUS 2025
A2 - Song, Rong
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2025 IEEE International Conference on Unmanned Systems, ICUS 2025
Y2 - 18 September 2025 through 19 September 2025
ER -