TY - JOUR
T1 - VID-SLAM
T2 - Robust Pose Estimation with RGBD-Inertial Input for Indoor Robotic Localization
AU - Shan, Dan
AU - Su, Jinhe
AU - Wang, Xiaofeng
AU - Liu, Yujun
AU - Zhou, Taojian
AU - Wu, Zebiao
N1 - Publisher Copyright: © 2024 by the authors.
PY - 2024/1
Y1 - 2024/1
AB - This study proposes a tightly coupled multi-sensor Simultaneous Localization and Mapping (SLAM) framework that integrates RGB-D and inertial measurements to achieve highly accurate six-degree-of-freedom (6-DoF) metric localization in a variety of environments. By jointly considering geometric consistency, inertial measurement unit (IMU) constraints, and visual re-projection errors, we present visual-inertial-depth odometry (VIDO), an efficient state estimation back-end that minimizes the combined losses of all factors. Existing visual-inertial odometry methods rely on visual feature-based constraints to eliminate the translational and angular drift produced by IMU noise. To reduce this reliance, we introduce the iterative closest point (ICP) error between adjacent frames and update the state vectors of observed frames by minimizing the estimation errors of all sensors. Moreover, a loop-closure module further optimizes the global pose graph to correct long-term drift. For the experiments, we collect an RGBD-inertial data set for a comprehensive evaluation of VID-SLAM. The data set contains RGB-D image pairs, IMU measurements, and two types of ground truth data. The experimental results show that VID-SLAM achieves state-of-the-art positioning accuracy and outperforms mainstream vSLAM solutions, including ElasticFusion, ORB-SLAM2, and VINS-Mono.
KW - Simultaneous Localization and Mapping
KW - motion and tracking
KW - vision for robotics and autonomous vehicles
UR - http://www.scopus.com/inward/record.url?scp=85183356626&partnerID=8YFLogxK
U2 - 10.3390/electronics13020318
DO - 10.3390/electronics13020318
M3 - Article
AN - SCOPUS:85183356626
SN - 2079-9292
VL - 13
JO - Electronics (Switzerland)
JF - Electronics (Switzerland)
IS - 2
M1 - 318
ER -