DEMVSNet: Denoising and depth inference for unstructured multi-view stereo on noised images

Jiawei Han; Xiaomei Chen; Yongtian Zhang; Weimin Hou; Zibo Hu

doi:10.1049/cvi2.12102

Jiawei Han, Xiaomei Chen^*, Yongtian Zhang, Weimin Hou, Zibo Hu

^*Corresponding author for this work

School of Optics and Photonics

Research output: Contribution to journal › Article › peer-review

2 Citations (Scopus)

Abstract

Most deep-learning-based multi-view stereo series studies are concerned with improving the depth prediction accuracy of noise-free images. However, it is difficult to obtain off-the-set clean images in practice and 3D convolutional neural networks require a lot of computing resources. To make full use of its computing power, different types of information can be processed simultaneously in the network. For these two issues, this paper proposes a novel multi-stage network architecture to address depth inference and denoising simultaneously. Specifically, 2D feature maps are first converted into 3D cost volumes containing pixel information and depth information through differentiable homography and Gaussian probability mapping. Then, the cost volume is input into the regularisation module in each network stage to obtain the predicted probability volumes. Furthermore, simple static weights lead to training failure, and it is necessary to dynamically adjust the loss function by gradient normalisation. The proposed method can dispose of pixel information and depth information simultaneously and both reach an excellent level. Extensive experimental results show that the authors’ work surpasses the state-of-the-art denoising on the DTU dataset (adding Gaussian–Poisson noise) and is more robust to noise images in depth inference.
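The abstract mentions evaluating on DTU images with added Gaussian-Poisson noise but gives no noise parameters. As a rough illustration of what such a corruption model usually looks like, the sketch below combines signal-dependent Poisson (shot) noise with additive Gaussian (read) noise on intensities in [0, 1]. It is a generic model, not the authors' data pipeline: the function names and the `peak` and `sigma` values are assumptions chosen only for the example.

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) sample with Knuth's multiplication method.

    Suitable for the small rates used here; large rates would underflow exp(-lam).
    """
    if lam <= 0:
        return 0
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def add_gaussian_poisson_noise(pixels, peak=30.0, sigma=0.05, seed=0):
    """Corrupt intensities in [0, 1] with Poisson plus Gaussian noise.

    peak  -- assumed photon count for a pixel of intensity 1.0 (hypothetical value)
    sigma -- assumed standard deviation of the Gaussian term (hypothetical value)
    """
    rng = random.Random(seed)
    noisy = []
    for value in pixels:
        shot = sample_poisson(value * peak, rng) / peak  # signal-dependent part
        read = rng.gauss(0.0, sigma)                     # signal-independent part
        noisy.append(min(1.0, max(0.0, shot + read)))    # clip back to [0, 1]
    return noisy


# One row of clean intensities and its noisy counterpart.
row = [0.0, 0.25, 0.5, 0.75, 1.0]
noisy_row = add_gaussian_poisson_noise(row)
```

A lower `peak` makes the Poisson term stronger relative to the signal, while `sigma` controls the noise floor that remains even in dark pixels.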

Original language	English
Pages (from-to)	570-580
Number of pages	11
Journal	IET Computer Vision
Volume	16
Issue number	7
DOI	https://doi.org/10.1049/cvi2.12102
Publication status	Published - Oct 2022

Access to Document

10.1049/cvi2.12102

Other files and links

Link to publication in Scopus

Cite this

Han, J., Chen, X., Zhang, Y., Hou, W., & Hu, Z. (2022). DEMVSNet: Denoising and depth inference for unstructured multi-view stereo on noised images. IET Computer Vision, 16(7), 570-580. https://doi.org/10.1049/cvi2.12102

@article{aaa16adadb284788864994bfbca19ab5,

title = "{DEMVSNet}: Denoising and depth inference for unstructured multi-view stereo on noised images",

abstract = "Most deep-learning-based multi-view stereo series studies are concerned with improving the depth prediction accuracy of noise-free images. However, it is difficult to obtain off-the-set clean images in practice and 3D convolutional neural networks require a lot of computing resources. To make full use of its computing power, different types of information can be processed simultaneously in the network. For these two issues, this paper proposes a novel multi-stage network architecture to address depth inference and denoising simultaneously. Specifically, 2D feature maps are first converted into 3D cost volumes containing pixel information and depth information through differentiable homography and Gaussian probability mapping. Then, the cost volume is input into the regularisation module in each network stage to obtain the predicted probability volumes. Furthermore, simple static weights lead to training failure, and it is necessary to dynamically adjust the loss function by gradient normalisation. The proposed method can dispose of pixel information and depth information simultaneously and both reach an excellent level. Extensive experimental results show that the authors{\textquoteright} work surpasses the state-of-the-art denoising on the DTU dataset (adding Gaussian–Poisson noise) and is more robust to noise images in depth inference.",

keywords = "computer vision, neural net architecture, random noise",

author = "Jiawei Han and Xiaomei Chen and Yongtian Zhang and Weimin Hou and Zibo Hu",

note = "Publisher Copyright: {\textcopyright} 2022 The Authors. IET Computer Vision published by John Wiley \& Sons Ltd on behalf of The Institution of Engineering and Technology.",

year = "2022",

month = oct,

doi = "10.1049/cvi2.12102",

language = "English",

volume = "16",

pages = "570--580",

journal = "IET Computer Vision",

issn = "1751-9632",

publisher = "John Wiley \& Sons Inc.",

number = "7",

}

TY - JOUR

T1 - DEMVSNet: Denoising and depth inference for unstructured multi-view stereo on noised images

AU - Han, Jiawei

AU - Chen, Xiaomei

AU - Zhang, Yongtian

AU - Hou, Weimin

AU - Hu, Zibo

PY - 2022/10

Y1 - 2022/10

N2 - Most deep-learning-based multi-view stereo series studies are concerned with improving the depth prediction accuracy of noise-free images. However, it is difficult to obtain off-the-set clean images in practice and 3D convolutional neural networks require a lot of computing resources. To make full use of its computing power, different types of information can be processed simultaneously in the network. For these two issues, this paper proposes a novel multi-stage network architecture to address depth inference and denoising simultaneously. Specifically, 2D feature maps are first converted into 3D cost volumes containing pixel information and depth information through differentiable homography and Gaussian probability mapping. Then, the cost volume is input into the regularisation module in each network stage to obtain the predicted probability volumes. Furthermore, simple static weights lead to training failure, and it is necessary to dynamically adjust the loss function by gradient normalisation. The proposed method can dispose of pixel information and depth information simultaneously and both reach an excellent level. Extensive experimental results show that the authors’ work surpasses the state-of-the-art denoising on the DTU dataset (adding Gaussian–Poisson noise) and is more robust to noise images in depth inference.

KW - computer vision

KW - neural net architecture

KW - random noise

UR - http://www.scopus.com/inward/record.url?scp=85127975478&partnerID=8YFLogxK

U2 - 10.1049/cvi2.12102

DO - 10.1049/cvi2.12102

M3 - Article

AN - SCOPUS:85127975478

SN - 1751-9632

VL - 16

SP - 570

EP - 580

JO - IET Computer Vision

JF - IET Computer Vision

IS - 7

ER -
