
Depth-Assisted Camera-Based Bird's Eye View Perception for Autonomous Driving

  • Shangwei Guo*
  • Jin Lu
  • Zhengchao Lai
  • Jun Li
  • Shaokun Han*
  • *Corresponding author for this work
  • Beijing Institute of Technology
  • Xi'an Electronic Engineering Research Institute

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Peer-reviewed

Abstract

Vision-centric Bird's Eye View (BEV) perception, encompassing object detection and map segmentation, plays a pivotal role in providing crucial 3D environmental information for autonomous driving decisions. However, because 2D images inherently lack depth information, converting perspective views to BEV is challenging, and camera-based BEV perception lags behind methods equipped with depth sensors. In this paper, we propose an approach that integrates depth estimation into camera-based BEV perception. By employing a depth estimation network, the method improves the 2D-to-3D feature transformation. Specifically, our method consists of a depth estimation branch and a BEV perception branch. The input image is fed into a shared image encoder to extract multi-scale features. In the depth estimation branch, a depth decoder uses these features to generate a depth map, which, in combination with sequential images and relative pose information, forms the basis for the reprojection photometric error that guides and supervises the branch. To address the scale ambiguity of monocular depth estimation, we incorporate ground-truth trajectory information collected by an IMU to constrain the predicted depth values, ensuring that the predicted depth is scale-aware. In the BEV perception branch, the aforementioned multi-scale features are projected into 3D space along the perspective rays, with the assistance of depth information derived from the depth estimation branch. Subsequently, the 3D features are collapsed along the vertical axis to generate BEV features, which are passed to a task-specific head after feature extraction. Experimental results on the nuScenes dataset demonstrate that our proposed method improves the performance of BEV-based object detection and map semantic segmentation by 2.8% and 2.2%, respectively.
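
The BEV perception branch described in the abstract lifts image features into 3D along perspective rays using depth, then collapses the vertical axis. The following is a minimal sketch of that idea, not the authors' implementation: it assumes a pinhole camera, a single metric depth value per pixel, and sum pooling within each BEV cell. The function names (`unproject`, `lift_and_collapse`), the intrinsics, the grid ranges, and the sample pixels are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

# One sample: (u, v, depth in metres, feature vector). Hypothetical layout.
Pixel = Tuple[float, float, float, List[float]]


def unproject(u: float, v: float, depth: float,
              fx: float, fy: float, cx: float, cy: float) -> Tuple[float, float, float]:
    """Back-project pixel (u, v) at the given depth into camera coordinates.

    Assumed camera frame: x right, y down, z forward (pinhole model).
    """
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return x, y, depth


def lift_and_collapse(pixels: List[Pixel],
                      fx: float, fy: float, cx: float, cy: float,
                      cell_size: float,
                      x_range: Tuple[float, float],
                      z_range: Tuple[float, float]) -> Dict[Tuple[int, int], List[float]]:
    """Scatter per-pixel features into a sparse BEV grid keyed by (row, col).

    The vertical coordinate y is ignored when binning, so every feature that
    falls in the same ground-plane cell is summed: this is the collapse step.
    """
    bev: Dict[Tuple[int, int], List[float]] = {}
    for u, v, depth, feat in pixels:
        x, _y, z = unproject(u, v, depth, fx, fy, cx, cy)
        # Drop points outside the BEV extent.
        if not (x_range[0] <= x < x_range[1] and z_range[0] <= z < z_range[1]):
            continue
        col = int((x - x_range[0]) // cell_size)
        row = int((z - z_range[0]) // cell_size)
        acc = bev.setdefault((row, col), [0.0] * len(feat))
        for i, value in enumerate(feat):
            acc[i] += value
    return bev


# Illustrative inputs (assumed values, not taken from the paper).
pixels = [
    (50.0, 50.0, 10.0, [1.0, 0.0]),   # straight ahead, 10 m
    (50.0, 20.0, 10.0, [0.0, 2.0]),   # same ray direction in x, higher up, 10 m
    (150.0, 50.0, 20.0, [3.0, 3.0]),  # to the right, 20 m
    (50.0, 50.0, 60.0, [9.0, 9.0]),   # beyond the grid, discarded
]
bev = lift_and_collapse(pixels, fx=100.0, fy=100.0, cx=50.0, cy=50.0,
                        cell_size=1.0, x_range=(-25.0, 25.0), z_range=(0.0, 50.0))
```

In this example the first two pixels differ only in height, so they land in the same BEV cell and their features are summed, while the fourth pixel lies outside the grid and is dropped. A learned pipeline would typically operate on dense feature maps and predicted depth rather than hand-picked samples.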

Original language: English
Title of host publication: IEEE ITAIC 2023 - IEEE 11th Joint International Information Technology and Artificial Intelligence Conference
Editors: Bing Xu, Kefen Mou
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1429-1433
Number of pages: 5
ISBN (Electronic): 9798350333664
DOI
Publication status: Published - 2023
Event: 11th Joint International Information Technology and Artificial Intelligence Conference, ITAIC 2023 - Chongqing, China
Duration: 8 Dec 2023 to 10 Dec 2023

Publication series

Name: IEEE Joint International Information Technology and Artificial Intelligence Conference (ITAIC)
ISSN (Print): 2693-2865

Conference

Conference: 11th Joint International Information Technology and Artificial Intelligence Conference, ITAIC 2023
Country/Territory: China
City: Chongqing
Period: 8/12/23 to 10/12/23
