TY - GEN
T1 - Self-supervised Monocular Depth Estimation in Challenging Environments Based on Illumination Compensation PoseNet
AU - Hou, Shengyu
AU - Song, Wenjie
AU - Wang, Rongchuan
AU - Wang, Meiling
AU - Yang, Yi
AU - Fu, Mengyin
N1 - Publisher Copyright:
© 2024 IEEE.
PY - 2024
Y1 - 2024
N2 - Self-supervised depth estimation has attracted much attention due to its ability to improve the 3D perception capabilities of unmanned systems. However, existing unsupervised frameworks rely on the assumption of photometric consistency, which may not hold in challenging environments such as night-time, rainy nights, or snowy winters: complex lighting and reflections produce inconsistent photometry for the same pixel across different frames. To address this problem, we propose a unified self-supervised monocular depth estimation framework that handles these complex scenarios and has the following characteristics: (1) an Illumination Compensation PoseNet (ICP) is designed, which builds on classic Phong illumination theory and compensates for lighting changes between adjacent frames by estimating per-pixel transformations; (2) a Dual-Axis Transformer (DAT) block is proposed as the backbone of the depth encoder, which infers the depth of locally repetitive-texture areas from spatial-channel dual-dimensional global context information of images. Experimental results demonstrate that our approach achieves state-of-the-art depth estimation results in complex environments on the challenging Oxford RobotCar dataset.
AB - Self-supervised depth estimation has attracted much attention due to its ability to improve the 3D perception capabilities of unmanned systems. However, existing unsupervised frameworks rely on the assumption of photometric consistency, which may not hold in challenging environments such as night-time, rainy nights, or snowy winters: complex lighting and reflections produce inconsistent photometry for the same pixel across different frames. To address this problem, we propose a unified self-supervised monocular depth estimation framework that handles these complex scenarios and has the following characteristics: (1) an Illumination Compensation PoseNet (ICP) is designed, which builds on classic Phong illumination theory and compensates for lighting changes between adjacent frames by estimating per-pixel transformations; (2) a Dual-Axis Transformer (DAT) block is proposed as the backbone of the depth encoder, which infers the depth of locally repetitive-texture areas from spatial-channel dual-dimensional global context information of images. Experimental results demonstrate that our approach achieves state-of-the-art depth estimation results in complex environments on the challenging Oxford RobotCar dataset.
UR - http://www.scopus.com/inward/record.url?scp=85216444622&partnerID=8YFLogxK
U2 - 10.1109/IROS58592.2024.10801331
DO - 10.1109/IROS58592.2024.10801331
M3 - Conference contribution
AN - SCOPUS:85216444622
T3 - IEEE International Conference on Intelligent Robots and Systems
SP - 9396
EP - 9403
BT - 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2024
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2024
Y2 - 14 October 2024 through 18 October 2024
ER -