A Collaborative Perception Network based on Dynamic Multi-scale Fusion

Yiming Li*, Meiling Wang, Xunjie He, Yufeng Yue*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Collaborative perception improves perception performance by aggregating information from the different viewpoints of multiple agents, which mitigates the obstacle occlusion and limited perception range that affect a single agent. However, under the transmission delays and localization errors that are inevitable in real-world communication, existing collaborative perception methods cannot effectively resolve temporal-spatial misalignment, which leads to a serious decline in detection performance and robustness. In this paper, we propose a novel collaborative perception framework, DynMSF (Dynamic Multi-Scale Fusion), that uses multi-scale strategies and dynamic information fusion to enhance both temporal and spatial robustness and to improve detection precision. First, we introduce a multi-scale collaboration (MSC) module, which fuses the agents' perception information at multiple scales to capture spatial correlations at each scale, reducing the negative effects of spatial misalignment. Building on the multi-scale collaborative features, we propose a dynamic temporal fusion (DTF) module that dynamically fuses historical frame features stored in memory banks, enhancing the features and compensating for the transmission delay of the current frame. We conduct experiments on the publicly available OPV2V and V2XSet datasets, where our model achieves the best performance among the existing baseline methods compared. We also verify the temporal-spatial robustness of our model and the effectiveness of the proposed modules through noise robustness experiments and an ablation study.
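
The abstract does not specify how the DTF module weights historical frames, so the following is only an illustrative sketch of the general idea of fusing a current feature with features held in a memory bank. Everything in it is an assumption made for illustration and is not taken from the paper: the names `softmax` and `fuse_with_memory`, the dot-product similarity scoring, the equal-weight blend, and the use of flat vectors in place of real feature maps.

```python
import math
from collections import deque


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_with_memory(current, memory):
    """Blend the current feature vector with stored historical ones.

    Hypothetical scheme: each historical vector is weighted by its scaled
    dot-product similarity to the current vector, and the weighted history
    is averaged with the current vector.
    """
    if not memory:
        return list(current)
    scale = math.sqrt(len(current))
    scores = [
        sum(c * h for c, h in zip(current, past)) / scale for past in memory
    ]
    weights = softmax(scores)
    history = [
        sum(w * past[i] for w, past in zip(weights, memory))
        for i in range(len(current))
    ]
    # Equal-weight blend of current and history (an assumption).
    return [0.5 * (c + h) for c, h in zip(current, history)]


# A bounded memory bank: the oldest frame is dropped once it is full.
memory_bank = deque(maxlen=3)
memory_bank.append([1.0, 0.0])
memory_bank.append([0.0, 1.0])
fused = fuse_with_memory([1.0, 0.0], memory_bank)
```

In this toy setup the historical frame that resembles the current one receives the larger weight, so the fused vector stays closer to the current observation while still drawing on the stored history.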

Original language: English
Title of host publication: Proceedings of the 43rd Chinese Control Conference, CCC 2024
Editors: Jing Na, Jian Sun
Publisher: IEEE Computer Society
Pages: 4061-4068
Number of pages: 8
ISBN (Electronic): 9789887581581
Publication status: Published - 2024
Event: 43rd Chinese Control Conference, CCC 2024 - Kunming, China
Duration: 28 Jul 2024 to 31 Jul 2024

Publication series

Name: Chinese Control Conference, CCC
ISSN (Print): 1934-1768
ISSN (Electronic): 2161-2927

Conference

Conference: 43rd Chinese Control Conference, CCC 2024
Country/Territory: China
City: Kunming
Period: 28/07/24 to 31/07/24

Keywords

  • 3D object detection
  • Collaborative perception
  • Dynamic temporal fusion
  • Multi-scale collaboration
  • Temporal-spatial misalignment
