基于多尺度时序采样的多任务感知网络

Translated title of the contribution: Multi Task Perception Network Based on Multi-Scale Temporal Sampling
  • Shaobin Wu*
  • Yunfeng Chu
  • Yixuan Li
  • Haojian Jiang
  • Yu Huang

  *Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

A novel multi-task perception network that integrates temporal multi-scale bird’s-eye-view (BEV) features is proposed to address insufficient temporal feature fusion and the difficulty of reliably perceiving occluded or distant targets. First, the depth prediction probability is modeled to build an occlusion-adaptive module that estimates visible depth, maps image features into BEV features, and is supervised with depth maps. Next, to improve the detection of long-distance obstacles, a temporal BEV sampling module based on the deformable attention mechanism is designed to perform weighted fusion of multi-scale BEV features across the time sequence. Finally, data augmentation strategies are extended to the multi-task setting, and 3D object detection and lane line segmentation are carried out by their respective task heads. Results on the nuScenes dataset and in real-vehicle experiments show that the approach improves detection accuracy for occluded areas and distant targets, and that its inference speed meets the requirements of real-world applications.
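
The depth-probability step in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a categorical depth distribution per pixel (as in common lift-style BEV methods), a one-hot depth bin taken from a depth map as the supervision target, and hypothetical function names and values throughout.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def lift_pixel(feature, depth_logits):
    """Spread one pixel's feature vector across depth bins.

    Returns one feature vector per depth bin, each scaled by the
    predicted probability that the pixel lies in that bin.
    (Assumed formulation, not taken from the paper.)
    """
    probs = softmax(depth_logits)
    return [[p * f for f in feature] for p in probs]


def depth_cross_entropy(depth_logits, true_bin):
    """Cross-entropy against a one-hot depth bin from a depth map."""
    probs = softmax(depth_logits)
    return -math.log(probs[true_bin])


# Hypothetical values: a 2-channel image feature and 4 depth bins.
feature = [1.0, -2.0]
depth_logits = [0.0, 2.0, 0.0, -1.0]

lifted = lift_pixel(feature, depth_logits)
loss = depth_cross_entropy(depth_logits, true_bin=1)
```

Because the bin probabilities sum to one, the lifted vectors sum back to the original pixel feature; a full pipeline would then pool these per-bin features into BEV grid cells using the camera geometry, which this sketch omits.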

Original language: Chinese (Traditional)
Pages (from-to): 789-797
Number of pages: 9
Journal: Beijing Ligong Daxue Xuebao/Transaction of Beijing Institute of Technology
Volume: 45
Issue number: 8
Publication status: Published - Aug 2025
