Novel View Synthesis from a Single RGBD Image for Indoor Scenes

Congrui Hetang; Yuping Wang

doi:10.1109/ICICML60161.2023.10424939

Congrui Hetang^*, Yuping Wang

^*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

In this paper, we propose an approach for synthesizing novel view images from a single RGBD (Red Green Blue-Depth) input. Novel view synthesis (NVS) is an interesting computer vision task with extensive applications. Methods using multiple images have been well studied; representative examples include training scene-specific Neural Radiance Fields (NeRF) and leveraging multi-view stereo (MVS) and 3D rendering pipelines. However, these methods are either computationally intensive or do not generalize across scenes, which limits their practical value. In contrast, the depth information embedded in an RGBD image provides 3D structure from a single view, simplifying NVS. The widespread availability of compact, affordable stereo cameras, and even LiDARs, in contemporary devices such as smartphones makes capturing RGBD images more accessible than ever. In our method, we convert an RGBD image into a point cloud and render it from a different viewpoint, then formulate the NVS task as an image translation problem. We leverage generative adversarial networks to style-transfer the rendered image, achieving a result similar to a photograph taken from the new perspective. We explore both unsupervised learning with CycleGAN and supervised learning with Pix2Pix, and present qualitative results. Our method circumvents the limitations of traditional multi-image techniques and holds significant promise for practical, real-time applications in NVS.
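The geometric first stage described in the abstract (lifting an RGBD image to a point cloud and rendering it from another viewpoint) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a pinhole camera, and the intrinsics `fx`, `fy`, `cx`, `cy`, the function names `backproject` and `render`, and the nearest-point z-buffer are all illustrative choices. The GAN-based image translation stage is not shown.

```python
from typing import List, Optional, Sequence, Tuple

Colour = Tuple[int, int, int]
ColouredPoint = Tuple[Tuple[float, float, float], Colour]


def backproject(depth: List[List[float]], rgb: List[List[Colour]],
                fx: float, fy: float, cx: float, cy: float) -> List[ColouredPoint]:
    """Lift each pixel with valid depth to a coloured 3D point (camera frame).

    Assumes a pinhole model; (fx, fy, cx, cy) are hypothetical intrinsics.
    """
    points = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0:  # treat non-positive depth as a missing measurement
                continue
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append(((x, y, z), rgb[v][u]))
    return points


def render(points: List[ColouredPoint],
           rotation: Sequence[Sequence[float]], translation: Sequence[float],
           fx: float, fy: float, cx: float, cy: float,
           width: int, height: int) -> List[List[Optional[Colour]]]:
    """Project points into a new camera, keeping the nearest point per pixel."""
    image: List[List[Optional[Colour]]] = [[None] * width for _ in range(height)]
    zbuf = [[float("inf")] * width for _ in range(height)]
    for point, colour in points:
        # Rigid transform into the new camera frame: p_new = R @ p_old + t
        xn, yn, zn = (sum(r * p for r, p in zip(row, point)) + t
                      for row, t in zip(rotation, translation))
        if zn <= 0:  # point lies behind the new camera
            continue
        u = round(fx * xn / zn + cx)
        v = round(fy * yn / zn + cy)
        if 0 <= u < width and 0 <= v < height and zn < zbuf[v][u]:
            zbuf[v][u] = zn
            image[v][u] = colour
    return image
```

Pixels that receive no point stay `None`. Such holes, together with the sparse, blocky look of projected points, are one reason a rendered point cloud does not resemble a photograph, which is the gap the image translation step is meant to close.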

Original language	English
Title of host publication	2023 International Conference on Image Processing, Computer Vision and Machine Learning, ICICML 2023
Publisher	Institute of Electrical and Electronics Engineers Inc.
Pages	447-450
Number of pages	4
ISBN (Electronic)	9798350331417
DOIs	https://doi.org/10.1109/ICICML60161.2023.10424939
Publication status	Published - 2023
Externally published	Yes
Event	2nd International Conference on Image Processing, Computer Vision and Machine Learning, ICICML 2023 - Hybrid, Chengdu, China. Duration: 3 Nov 2023 → 5 Nov 2023

Publication series

Name	2023 International Conference on Image Processing, Computer Vision and Machine Learning, ICICML 2023

Conference

Conference	2nd International Conference on Image Processing, Computer Vision and Machine Learning, ICICML 2023
Country/Territory	China
City	Hybrid, Chengdu
Period	3/11/23 → 5/11/23

Keywords

3D reconstruction
Generative Adversarial Network
Image Style Transfer
Novel View Synthesis

Access to Document

10.1109/ICICML60161.2023.10424939

Cite this

Hetang, C., & Wang, Y. (2023). Novel View Synthesis from a Single RGBD Image for Indoor Scenes. In 2023 International Conference on Image Processing, Computer Vision and Machine Learning, ICICML 2023 (pp. 447-450). (2023 International Conference on Image Processing, Computer Vision and Machine Learning, ICICML 2023). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ICICML60161.2023.10424939

Hetang, Congrui ; Wang, Yuping. / Novel View Synthesis from a Single RGBD Image for Indoor Scenes. 2023 International Conference on Image Processing, Computer Vision and Machine Learning, ICICML 2023. Institute of Electrical and Electronics Engineers Inc., 2023. pp. 447-450 (2023 International Conference on Image Processing, Computer Vision and Machine Learning, ICICML 2023).

@inproceedings{f3c5588a5cba453d83bb72f7d7eb3da1,

title = "Novel View Synthesis from a Single RGBD Image for Indoor Scenes",

abstract = "In this paper, we propose an approach for synthesizing novel view images from a single RGBD (Red Green Blue-Depth) input. Novel view synthesis (NVS) is an interesting computer vision task with extensive applications. Methods using multiple images have been well studied; representative examples include training scene-specific Neural Radiance Fields (NeRF) and leveraging multi-view stereo (MVS) and 3D rendering pipelines. However, these methods are either computationally intensive or do not generalize across scenes, which limits their practical value. In contrast, the depth information embedded in an RGBD image provides 3D structure from a single view, simplifying NVS. The widespread availability of compact, affordable stereo cameras, and even LiDARs, in contemporary devices such as smartphones makes capturing RGBD images more accessible than ever. In our method, we convert an RGBD image into a point cloud and render it from a different viewpoint, then formulate the NVS task as an image translation problem. We leverage generative adversarial networks to style-transfer the rendered image, achieving a result similar to a photograph taken from the new perspective. We explore both unsupervised learning with CycleGAN and supervised learning with Pix2Pix, and present qualitative results. Our method circumvents the limitations of traditional multi-image techniques and holds significant promise for practical, real-time applications in NVS.",

keywords = "3D reconstruction, Generative Adversarial Network, Image Style Transfer, Novel View Synthesis",

author = "Congrui Hetang and Yuping Wang",

note = "Publisher Copyright: {\textcopyright} 2023 IEEE.; 2nd International Conference on Image Processing, Computer Vision and Machine Learning, ICICML 2023 ; Conference date: 03-11-2023 Through 05-11-2023",

year = "2023",

doi = "10.1109/ICICML60161.2023.10424939",

language = "English",

series = "2023 International Conference on Image Processing, Computer Vision and Machine Learning, ICICML 2023",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

pages = "447--450",

booktitle = "2023 International Conference on Image Processing, Computer Vision and Machine Learning, ICICML 2023",

address = "United States",

}

Hetang, C & Wang, Y 2023, Novel View Synthesis from a Single RGBD Image for Indoor Scenes. in 2023 International Conference on Image Processing, Computer Vision and Machine Learning, ICICML 2023. 2023 International Conference on Image Processing, Computer Vision and Machine Learning, ICICML 2023, Institute of Electrical and Electronics Engineers Inc., pp. 447-450, 2nd International Conference on Image Processing, Computer Vision and Machine Learning, ICICML 2023, Hybrid, Chengdu, China, 3/11/23. https://doi.org/10.1109/ICICML60161.2023.10424939

Novel View Synthesis from a Single RGBD Image for Indoor Scenes. / Hetang, Congrui; Wang, Yuping.
2023 International Conference on Image Processing, Computer Vision and Machine Learning, ICICML 2023. Institute of Electrical and Electronics Engineers Inc., 2023. p. 447-450 (2023 International Conference on Image Processing, Computer Vision and Machine Learning, ICICML 2023).

TY - GEN

T1 - Novel View Synthesis from a Single RGBD Image for Indoor Scenes

AU - Hetang, Congrui

AU - Wang, Yuping

PY - 2023

Y1 - 2023

AB - In this paper, we propose an approach for synthesizing novel view images from a single RGBD (Red Green Blue-Depth) input. Novel view synthesis (NVS) is an interesting computer vision task with extensive applications. Methods using multiple images have been well studied; representative examples include training scene-specific Neural Radiance Fields (NeRF) and leveraging multi-view stereo (MVS) and 3D rendering pipelines. However, these methods are either computationally intensive or do not generalize across scenes, which limits their practical value. In contrast, the depth information embedded in an RGBD image provides 3D structure from a single view, simplifying NVS. The widespread availability of compact, affordable stereo cameras, and even LiDARs, in contemporary devices such as smartphones makes capturing RGBD images more accessible than ever. In our method, we convert an RGBD image into a point cloud and render it from a different viewpoint, then formulate the NVS task as an image translation problem. We leverage generative adversarial networks to style-transfer the rendered image, achieving a result similar to a photograph taken from the new perspective. We explore both unsupervised learning with CycleGAN and supervised learning with Pix2Pix, and present qualitative results. Our method circumvents the limitations of traditional multi-image techniques and holds significant promise for practical, real-time applications in NVS.

KW - 3D reconstruction

KW - Generative Adversarial Network

KW - Image Style Transfer

KW - Novel View Synthesis

UR - http://www.scopus.com/inward/record.url?scp=85186507644&partnerID=8YFLogxK

U2 - 10.1109/ICICML60161.2023.10424939

DO - 10.1109/ICICML60161.2023.10424939

M3 - Conference contribution

AN - SCOPUS:85186507644

T3 - 2023 International Conference on Image Processing, Computer Vision and Machine Learning, ICICML 2023

SP - 447

EP - 450

BT - 2023 International Conference on Image Processing, Computer Vision and Machine Learning, ICICML 2023

PB - Institute of Electrical and Electronics Engineers Inc.

T2 - 2nd International Conference on Image Processing, Computer Vision and Machine Learning, ICICML 2023

Y2 - 3 November 2023 through 5 November 2023

ER -

Hetang C, Wang Y. Novel View Synthesis from a Single RGBD Image for Indoor Scenes. In 2023 International Conference on Image Processing, Computer Vision and Machine Learning, ICICML 2023. Institute of Electrical and Electronics Engineers Inc. 2023. p. 447-450. (2023 International Conference on Image Processing, Computer Vision and Machine Learning, ICICML 2023). doi: 10.1109/ICICML60161.2023.10424939
