TY - GEN
T1 - Self-Supervised Monocular Depth Estimation for Dynamic Targets
AU - Zhao, Jing
AU - Dong, Liquan
AU - Liu, Haojie
AU - Lv, Chengwei
AU - Zhang, Rujia
AU - Kong, Lingqin
AU - Liu, Ming
N1 - Publisher Copyright:
© 2025 SPIE.
PY - 2025
Y1 - 2025
N2 - Monocular depth estimation is a classic problem in computer vision, with wide applications in 3D scene reconstruction and augmented reality. In this paper, a self-supervised monocular depth estimation model for dynamic targets is designed to address the shortcomings of current unsupervised monocular depth estimation in moving scenes. To improve depth accuracy in dynamic regions, a cross-attention mechanism is proposed to strengthen the fusion of local dynamic-region features, refining the overall depth structure, expanding the receptive field, and enhancing the representation of target-region features. At the same time, a depth prior, namely pseudo-depth, is generated. The problem of blurred dynamic-target edges is addressed by matching the surface normals of the predicted depth and the pseudo-depth and by constraining the relative normal angles of the two depth maps around dynamic-region edges to be consistent. Two channel attention modules are also designed to fuse semantic information from different scales more effectively, improving the overall modeling quality. Experiments on the KITTI and DDAD datasets show that the proposed method outperforms mainstream monocular depth estimation methods, particularly in dynamic regions. Compared with the baseline network, the absolute relative error and squared relative error are reduced by up to 30% and 70%, respectively.
AB - Monocular depth estimation is a classic problem in computer vision, with wide applications in 3D scene reconstruction and augmented reality. In this paper, a self-supervised monocular depth estimation model for dynamic targets is designed to address the shortcomings of current unsupervised monocular depth estimation in moving scenes. To improve depth accuracy in dynamic regions, a cross-attention mechanism is proposed to strengthen the fusion of local dynamic-region features, refining the overall depth structure, expanding the receptive field, and enhancing the representation of target-region features. At the same time, a depth prior, namely pseudo-depth, is generated. The problem of blurred dynamic-target edges is addressed by matching the surface normals of the predicted depth and the pseudo-depth and by constraining the relative normal angles of the two depth maps around dynamic-region edges to be consistent. Two channel attention modules are also designed to fuse semantic information from different scales more effectively, improving the overall modeling quality. Experiments on the KITTI and DDAD datasets show that the proposed method outperforms mainstream monocular depth estimation methods, particularly in dynamic regions. Compared with the baseline network, the absolute relative error and squared relative error are reduced by up to 30% and 70%, respectively.
KW - Dynamic Region Feature Enhancement
KW - Monocular Depth Estimation
KW - Self-Supervised Learning
UR - http://www.scopus.com/inward/record.url?scp=85219364583&partnerID=8YFLogxK
U2 - 10.1117/12.3057042
DO - 10.1117/12.3057042
M3 - Conference contribution
AN - SCOPUS:85219364583
T3 - Proceedings of SPIE - The International Society for Optical Engineering
BT - Tenth Symposium on Novel Optoelectronic Detection Technology and Applications
A2 - Ping, Chen
PB - SPIE
T2 - 10th Symposium on Novel Optoelectronic Detection Technology and Applications
Y2 - 1 November 2024 through 3 November 2024
ER -