Depth prediction method based on monocular images and sparse point cloud fusion

  • Zijie Ma
  • Xia Wang*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Monocular depth estimation cannot measure depth directly, which limits its prediction accuracy. This paper proposes a depth prediction method that fuses monocular images with sparse point clouds, combining joint calibration with a multimodal fusion algorithm. The joint calibration method uses a graph neural network to register image and point cloud key points through local feature matching, thereby reducing depth prediction errors at the dataset level. After registration, a multimodal fusion algorithm based on a residual neural network extracts and fuses features from the monocular image and the sparse point cloud and outputs a depth image. Experimental results show that, compared with depth estimation from monocular images alone, the proposed method improves depth prediction accuracy from 57% to 93% on the KITTI dataset and from 55% to 86% on a self-constructed tank target dataset. The approach achieves high-precision depth prediction for outdoor targets and offers a reliable way to improve depth estimation accuracy.
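
The abstract does not say how the registered point cloud is presented to the fusion network. One common preprocessing step, assumed here purely for illustration and not taken from the paper, is to project the calibrated points into the image plane to form a sparse depth map aligned with the camera image. The sketch below uses an undistorted pinhole camera model; the function name `project_points_to_depth_map`, its parameters, and the convention that 0.0 marks a pixel with no measurement are all hypothetical choices.

```python
from typing import List, Sequence, Tuple

Point3D = Tuple[float, float, float]


def project_points_to_depth_map(
    points: Sequence[Point3D],
    rotation: Sequence[Sequence[float]],  # 3x3 sensor-to-camera rotation (assumed known from calibration)
    translation: Sequence[float],         # 3-vector sensor-to-camera translation (assumed known)
    fx: float, fy: float,                 # focal lengths in pixels (assumed pinhole model)
    cx: float, cy: float,                 # principal point in pixels
    width: int, height: int,
) -> List[List[float]]:
    """Project 3D points into the image and return a sparse depth map.

    Pixels with no projected point keep the value 0.0 (hypothetical convention).
    When several points land on one pixel, the nearest depth is kept.
    """
    depth = [[0.0] * width for _ in range(height)]
    for x, y, z in points:
        # Rigid transform from the range-sensor frame into the camera frame.
        xc = rotation[0][0] * x + rotation[0][1] * y + rotation[0][2] * z + translation[0]
        yc = rotation[1][0] * x + rotation[1][1] * y + rotation[1][2] * z + translation[1]
        zc = rotation[2][0] * x + rotation[2][1] * y + rotation[2][2] * z + translation[2]
        if zc <= 0.0:
            continue  # point lies behind the camera
        # Pinhole projection to the nearest pixel.
        u = int(round(fx * xc / zc + cx))
        v = int(round(fy * yc / zc + cy))
        if 0 <= u < width and 0 <= v < height:
            if depth[v][u] == 0.0 or zc < depth[v][u]:
                depth[v][u] = zc
    return depth
```

Under these assumptions the resulting map has the same resolution as the image, so the two can be fed to separate feature extractors or stacked as input channels. The quality of the alignment depends entirely on the rotation and translation estimates, which is why the calibration step matters for the final depth accuracy.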

Original language: English
Title of host publication: Advanced Optical Imaging Technologies VIII
Editors: Xiao-Cong Yuan, P. Scott Carney, Kebin Shi
Publisher: SPIE
ISBN (Electronic): 9781510693869
Publication status: Published - 21 Nov 2025
Externally published: Yes
Event: 8th Advanced Optical Imaging Technologies - Beijing, China
Duration: 12 Oct 2025 → 14 Oct 2025

Publication series

Name: Proceedings of SPIE - The International Society for Optical Engineering
Volume: 13717
ISSN (Print): 0277-786X
ISSN (Electronic): 1996-756X

Conference

Conference: 8th Advanced Optical Imaging Technologies
Country/Territory: China
City: Beijing
Period: 12/10/25 → 14/10/25

Keywords

  • Depth prediction
  • Joint calibration
  • Multimodal fusion
  • Sparse point cloud
