TY - JOUR
T1 - Photon-Efficient 3D Reconstruction with a Coarse-to-Fine Neural Network
AU - Guo, Shangwei
AU - Lai, Zhengchao
AU - Li, Jun
AU - Han, Shaokun
N1 - Publisher Copyright:
© 2022
PY - 2022/12
Y1 - 2022/12
N2 - 3D reconstruction from the sparse and noisy Time-of-Arrival (ToA) cube of a photon-efficient measurement is challenging because the effective echo signal occupies only a small part of the time channel, while the rest of the time channel contains only noise. Existing learning-based photon-efficient 3D reconstruction methods extract features over the entire time channel of the ToA cube. As a result, the learned features are dominated by the noise that accounts for most of the time channel, which reduces the 3D reconstruction accuracy. In this paper, we propose a Coarse-to-Fine neural network in which the coarse part eliminates the invalid noisy time bins and the fine part extracts features only from the remaining time bins that contain the effective echo signal. Specifically, locating the interval to which the effective echo signal belongs requires capturing non-local spatial-temporal features of the ToA cube. To this end, we propose a transformer-based Coarse-Interval-Localization-Network (CILN), whose global receptive field aggregates features from distant time bins. The located interval containing only the effective echo signal is then cropped from the ToA cube and fed to the proposed Fine-Maximum-Localization-Network (FMLN) to locate the maximum of the echo signal. Because the cropping operation disrupts the distribution of the original signal, we propose a position encoding module that propagates the distribution-change information into the high-dimensional feature space of the FMLN. Furthermore, we propose a temporal attention module that guides the FMLN to focus on the useful signal. Compared with methods that extract features over the entire time channel, the coarse-to-fine configuration removes the time bins containing only noise in the coarse part and reduces the influence of noise on the feature extraction of the fine part, thereby improving the reconstruction accuracy. Experiments on both simulated and real-world data show that the proposed Coarse-to-Fine neural network achieves state-of-the-art performance.
AB - 3D reconstruction from the sparse and noisy Time-of-Arrival (ToA) cube of a photon-efficient measurement is challenging because the effective echo signal occupies only a small part of the time channel, while the rest of the time channel contains only noise. Existing learning-based photon-efficient 3D reconstruction methods extract features over the entire time channel of the ToA cube. As a result, the learned features are dominated by the noise that accounts for most of the time channel, which reduces the 3D reconstruction accuracy. In this paper, we propose a Coarse-to-Fine neural network in which the coarse part eliminates the invalid noisy time bins and the fine part extracts features only from the remaining time bins that contain the effective echo signal. Specifically, locating the interval to which the effective echo signal belongs requires capturing non-local spatial-temporal features of the ToA cube. To this end, we propose a transformer-based Coarse-Interval-Localization-Network (CILN), whose global receptive field aggregates features from distant time bins. The located interval containing only the effective echo signal is then cropped from the ToA cube and fed to the proposed Fine-Maximum-Localization-Network (FMLN) to locate the maximum of the echo signal. Because the cropping operation disrupts the distribution of the original signal, we propose a position encoding module that propagates the distribution-change information into the high-dimensional feature space of the FMLN. Furthermore, we propose a temporal attention module that guides the FMLN to focus on the useful signal. Compared with methods that extract features over the entire time channel, the coarse-to-fine configuration removes the time bins containing only noise in the coarse part and reduces the influence of noise on the feature extraction of the fine part, thereby improving the reconstruction accuracy. Experiments on both simulated and real-world data show that the proposed Coarse-to-Fine neural network achieves state-of-the-art performance.
KW - LiDAR
KW - Neural Network
KW - Photon-Efficient 3D Reconstruction
KW - Single-Photon LiDAR
UR - http://www.scopus.com/inward/record.url?scp=85135913614&partnerID=8YFLogxK
U2 - 10.1016/j.optlaseng.2022.107224
DO - 10.1016/j.optlaseng.2022.107224
M3 - Article
AN - SCOPUS:85135913614
SN - 0143-8166
VL - 159
JO - Optics and Lasers in Engineering
JF - Optics and Lasers in Engineering
M1 - 107224
ER -