VID-SLAM: Robust Pose Estimation with RGBD-Inertial Input for Indoor Robotic Localization

Dan Shan, Jinhe Su*, Xiaofeng Wang, Yujun Liu, Taojian Zhou, Zebiao Wu*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

3 Citations (Scopus)

Abstract

This study proposes a tightly coupled multi-sensor Simultaneous Localization and Mapping (SLAM) framework that fuses RGB-D and inertial measurements to achieve highly accurate six-degree-of-freedom (6-DoF) metric localization in a variety of environments. By jointly considering geometric consistency, inertial measurement unit (IMU) constraints, and visual re-projection errors, we present visual-inertial-depth odometry (called VIDO), an efficient state-estimation back-end that minimizes the cascading losses of all factors. Existing visual-inertial odometry systems rely on visual feature-based constraints to eliminate the translational displacement and angular drift produced by IMU noise. To mitigate these limitations, we introduce the iterative closest point (ICP) error between adjacent frames and update the state vectors of the observed frames by minimizing the estimation errors of all sensors. Moreover, a loop-closure module enables further optimization of the global pose graph to correct long-term drift. For the experiments, we collect an RGB-D-inertial dataset for a comprehensive evaluation of VID-SLAM; it contains RGB-D image pairs, IMU measurements, and two types of ground-truth data. The experimental results show that VID-SLAM achieves state-of-the-art positioning accuracy and outperforms mainstream vSLAM solutions, including ElasticFusion, ORB-SLAM2, and VINS-Mono.
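The joint optimization described above — minimizing visual re-projection, IMU, and ICP residuals over a shared pose state — can be illustrated with a toy sketch. This is not the paper's implementation: the 2D (SE(2)) pose, the weights, and the simulated measurements (`obs`, `imu_pred`, `src`/`tgt` point pairs) are all assumptions chosen to show how the three residual types stack into one least-squares problem.

```python
import numpy as np
from scipy.optimize import least_squares

def se2(pose):
    """Unpack a 2D pose [x, y, theta] into a rotation matrix and translation."""
    x, y, th = pose
    R = np.array([[np.cos(th), -np.sin(th)],
                  [np.sin(th),  np.cos(th)]])
    return R, np.array([x, y])

# --- simulated world and measurements (all hypothetical, for illustration) ---
landmarks = np.array([[2.0, 1.0], [1.0, 3.0], [4.0, 2.0]])   # known map points
true_pose = np.array([0.5, -0.3, 0.1])                        # ground-truth pose
R_t, t_t = se2(true_pose)
obs = (landmarks - t_t) @ R_t            # landmarks seen in the camera frame
imu_pred = np.array([0.45, -0.25, 0.12]) # noisy pose predicted by IMU preintegration
src = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])          # depth points, current frame
tgt = src @ R_t.T + t_t                  # matched points in the reference frame (ICP pairs)

def residuals(pose, w_vis=1.0, w_imu=0.5, w_icp=1.0):
    """Stack weighted visual, inertial, and ICP residuals into one vector."""
    R, t = se2(pose)
    r_vis = ((landmarks - t) @ R - obs).ravel()   # visual re-projection error
    r_imu = pose - imu_pred                       # IMU (preintegration) constraint
    r_icp = (src @ R.T + t - tgt).ravel()         # point-to-point ICP error
    return np.concatenate([w_vis * r_vis, w_imu * r_imu, w_icp * r_icp])

# Minimize the combined cost, seeded from the IMU prediction.
sol = least_squares(residuals, imu_pred)
```

Because the visual and depth residuals are consistent with the true pose while the IMU term only acts as a weak prior, the estimate `sol.x` lands close to `true_pose`; in the full system, the same structure is solved over many frames with proper covariances instead of scalar weights.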

Original language: English
Article number: 318
Journal: Electronics (Switzerland)
Volume: 13
Issue: 2
Publication status: Published - Jan 2024

Keywords

  • Simultaneous Localization and Mapping
  • motion and tracking
  • vision for robotics and autonomous vehicles
