Abstract
Endoscopy plays a pivotal role in the early detection and treatment of diverse diseases, and artificial intelligence (AI)-assisted methods are increasingly gaining prominence in disease screening. In particular, depth estimation from endoscopic sequences is crucial for a spectrum of AI-assisted surgical techniques. However, developing endoscopic depth estimation algorithms is challenging due to the unique environmental intricacies of endoscopy and the constraints of available datasets. This paper proposes a self-supervised depth estimation network that comprehensively explores brightness changes in endoscopic images and fuses features at multiple levels to achieve accurate prediction of endoscopic depth. First, a FlowNet is designed to evaluate the brightness changes of adjacent frames by calculating their multi-scale structural similarity. Second, a feature fusion module is presented to capture multi-scale contextual information. Experiments show that the algorithm achieves an average accuracy of 97.03% on the Stereo Correspondence and Reconstruction of Endoscopic Data (SCARED) dataset. Using the parameters trained on the SCARED dataset, the algorithm also performs well on two additional datasets (EndoSLAM and KVASIR), indicating good generalization performance.
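The brightness-change measure described above rests on multi-scale structural similarity (MS-SSIM) between adjacent frames. The following is a minimal PyTorch sketch of that idea; the function names, 3×3 averaging window, and three-scale pyramid are illustrative assumptions, not the paper's released code.

```python
# Minimal sketch: MS-SSIM between adjacent endoscopic frames, assuming
# inputs are batches of images in [0, 1] with shape (B, C, H, W).
import torch
import torch.nn.functional as F

def ssim(x: torch.Tensor, y: torch.Tensor,
         c1: float = 0.01 ** 2, c2: float = 0.03 ** 2) -> torch.Tensor:
    """Single-scale SSIM using 3x3 local averaging (an assumed window size)."""
    mu_x = F.avg_pool2d(x, 3, 1, padding=1)
    mu_y = F.avg_pool2d(y, 3, 1, padding=1)
    sigma_x = F.avg_pool2d(x * x, 3, 1, padding=1) - mu_x ** 2
    sigma_y = F.avg_pool2d(y * y, 3, 1, padding=1) - mu_y ** 2
    sigma_xy = F.avg_pool2d(x * y, 3, 1, padding=1) - mu_x * mu_y
    num = (2 * mu_x * mu_y + c1) * (2 * sigma_xy + c2)
    den = (mu_x ** 2 + mu_y ** 2 + c1) * (sigma_x + sigma_y + c2)
    return (num / den).mean()

def multi_scale_ssim(frame_t: torch.Tensor, frame_t1: torch.Tensor,
                     scales: int = 3) -> torch.Tensor:
    """Average SSIM over a downsampling pyramid; a score near 1.0 means
    little brightness/structure change between the adjacent frames."""
    scores = []
    for _ in range(scales):
        scores.append(ssim(frame_t, frame_t1))
        frame_t = F.avg_pool2d(frame_t, 2)   # halve resolution per scale
        frame_t1 = F.avg_pool2d(frame_t1, 2)
    return torch.stack(scores).mean()

# Hypothetical usage with a random pair of adjacent frames:
a, b = torch.rand(1, 3, 128, 128), torch.rand(1, 3, 128, 128)
print(multi_scale_ssim(a, b).item())
```

A score like this can weight or gate the photometric loss in self-supervised training, so that frames with large illumination changes contribute less to the depth supervision signal.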
| Original language | English |
| --- | --- |
| Pages (from-to) | 1-11 |
| Number of pages | 11 |
| Journal | IEEE Journal of Biomedical and Health Informatics |
| DOIs | |
| Publication status | Accepted/In press - 2024 |
Keywords
- Accuracy
- Brightness
- brightness inconsistency
- depth estimation
- Endoscopes
- Estimation
- feature fusion
- self-supervised learning
- Surgery
- Surgical vision
- Three-dimensional displays
- Training