Unsupervised learning of depth estimation from imperfect rectified stereo laparoscopic images

Huoling Luo, Congcong Wang, Xingguang Duan, Hao Liu, Ping Wang, Qingmao Hu, Fucang Jia^*

doi:10.1016/j.compbiomed.2021.105109

^*Corresponding author for this work

School of Mechatronical Engineering

Research output: Contribution to journal › Article › peer-review

17 Citations (Scopus)

Abstract

Background: Learning-based methods have achieved remarkable performance in depth estimation. However, most self-learning and unsupervised learning methods presuppose rigorous, geometrically aligned stereo rectification, and their performance degrades when the rectification is inaccurate. We therefore explore an approach to unsupervised depth estimation from stereo images that can handle imperfect camera parameters.

Methods: We propose an unsupervised deep convolutional network that takes rectified stereo image pairs as input and outputs the corresponding dense disparity maps. First, a new vertical correction module predicts a correction map that compensates for imperfect geometric alignment. Second, the left and right images, reconstructed from the input image pair, the corresponding disparities, and the vertical correction maps, are treated as the outputs of the generator of a generative adversarial network (GAN). The discriminator of the GAN then distinguishes the reconstructed images from the original inputs, forcing the generator to output increasingly realistic images. In addition, a residual mask excludes from the loss calculation those pixels that conflict with the appearance of the original image.

Results: The proposed model is validated on the publicly available Stereo Correspondence and Reconstruction of Endoscopic Data (SCARED) dataset, where it achieves an average mean absolute error (MAE) of 3.054 mm.

Conclusion: Our model can effectively handle imperfectly rectified stereo images for depth estimation.
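As an illustration only, and not the authors' implementation, the reconstruction step described in the abstract can be sketched in NumPy. The sketch assumes that a left-image pixel (x, y) corresponds to the right-image pixel (x - disparity, y + vertical correction). The function names, the sign conventions, the nearest-neighbour sampling, and the threshold-based mask are all hypothetical choices made for this example; the abstract does not specify them.

```python
import numpy as np

def reconstruct_left(right, disparity, v_correction):
    """Rebuild the left view by sampling the right image.

    Assumed convention: left pixel (x, y) matches right pixel
    (x - disparity, y + v_correction). Nearest-neighbour sampling
    keeps the sketch short; a trainable model would need a
    differentiable (e.g. bilinear) sampler instead.
    """
    h, w = disparity.shape
    ys, xs = np.mgrid[0:h, 0:w]
    # Horizontal shift from the disparity map, clamped to the image.
    src_x = np.clip(np.rint(xs - disparity).astype(int), 0, w - 1)
    # Vertical shift from the correction map, clamped to the image.
    src_y = np.clip(np.rint(ys + v_correction).astype(int), 0, h - 1)
    return right[src_y, src_x]

def masked_l1(left, recon, threshold=0.5):
    """Mean absolute residual over pixels kept by a simple mask.

    The threshold rule is a hypothetical stand-in for the residual
    mask mentioned in the abstract, whose exact form is not given.
    """
    residual = np.abs(left - recon)
    mask = residual < threshold  # drop pixels with large residuals
    return float(residual[mask].mean()) if mask.any() else 0.0
```

With a constant disparity of 2 pixels and a vertical misalignment of 1 pixel, passing a correction map of ones recovers the left image away from the borders, whereas a correction map of zeros samples the wrong row. This is the kind of residual vertical offset that a purely horizontal disparity search cannot absorb.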

Original language	English
Article number	105109
Journal	Computers in Biology and Medicine
Volume	140
DOIs	https://doi.org/10.1016/j.compbiomed.2021.105109
Publication status	Published - Jan 2022

Keywords

Depth estimation
Imperfect rectified stereo images
Laparoscopic image
Stereo matching
Unsupervised learning

Cite this

@article{d50ca1a0402043f885b1ed6a112e860b,

title = "Unsupervised learning of depth estimation from imperfect rectified stereo laparoscopic images",

keywords = "Depth estimation, Imperfect rectified stereo images, Laparoscopic image, Stereo matching, Unsupervised learning",

author = "Huoling Luo and Congcong Wang and Xingguang Duan and Hao Liu and Ping Wang and Qingmao Hu and Fucang Jia",

note = "Publisher Copyright: {\textcopyright} 2021 Elsevier Ltd",

year = "2022",

month = jan,

doi = "10.1016/j.compbiomed.2021.105109",

language = "English",

volume = "140",

journal = "Computers in Biology and Medicine",

issn = "0010-4825",

publisher = "Elsevier Ltd.",

}

TY - JOUR

T1 - Unsupervised learning of depth estimation from imperfect rectified stereo laparoscopic images

AU - Luo, Huoling

AU - Wang, Congcong

AU - Duan, Xingguang

AU - Liu, Hao

AU - Wang, Ping

AU - Hu, Qingmao

AU - Jia, Fucang

PY - 2022/1

Y1 - 2022/1

KW - Depth estimation

KW - Imperfect rectified stereo images

KW - Laparoscopic image

KW - Stereo matching

KW - Unsupervised learning

UR - http://www.scopus.com/inward/record.url?scp=85120657673&partnerID=8YFLogxK

U2 - 10.1016/j.compbiomed.2021.105109

DO - 10.1016/j.compbiomed.2021.105109

M3 - Article

C2 - 34891097

AN - SCOPUS:85120657673

SN - 0010-4825

VL - 140

JO - Computers in Biology and Medicine

JF - Computers in Biology and Medicine

M1 - 105109

ER -
