TY - GEN
T1 - Multi-view Consistency View Synthesis
AU - Wu, Xiaodi
AU - Zhang, Zhiqiang
AU - Yu, Wenxin
AU - Chen, Shiyu
AU - Gao, Yufei
AU - Chen, Peng
AU - Gong, Jun
N1 - Publisher Copyright:
© 2024, The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
PY - 2024
Y1 - 2024
N2 - Novel view synthesis (NVS) aims to synthesize photo-realistic images of a scene from existing source images. The core objective is for the synthesized images to match the actual scene content as closely as possible. In recent years, various approaches have shifted the focus towards the visual quality of images in continuous space or time. However, current methods for static scenes treat the rendering of each image as an isolated process, neglecting the geometric consistency of the static scene. This often results in incoherent visual experiences, such as flicker or artifacts, in synthesized image sequences. To address this limitation, we propose Multi-View Consistency View Synthesis (MCVS). MCVS leverages long short-term memory (LSTM) and a self-attention mechanism to model the spatial correlation between synthesized images, thereby pushing them closer to the ground truth. MCVS not only enhances multi-view consistency but also improves the overall quality of the synthesized images. The proposed method is evaluated on the Tanks and Temples dataset and the FVS dataset. On average, its Learned Perceptual Image Patch Similarity (LPIPS) is better than that of state-of-the-art approaches by 0.14 to 0.16%, indicating the superiority of our approach.
AB - Novel view synthesis (NVS) aims to synthesize photo-realistic images of a scene from existing source images. The core objective is for the synthesized images to match the actual scene content as closely as possible. In recent years, various approaches have shifted the focus towards the visual quality of images in continuous space or time. However, current methods for static scenes treat the rendering of each image as an isolated process, neglecting the geometric consistency of the static scene. This often results in incoherent visual experiences, such as flicker or artifacts, in synthesized image sequences. To address this limitation, we propose Multi-View Consistency View Synthesis (MCVS). MCVS leverages long short-term memory (LSTM) and a self-attention mechanism to model the spatial correlation between synthesized images, thereby pushing them closer to the ground truth. MCVS not only enhances multi-view consistency but also improves the overall quality of the synthesized images. The proposed method is evaluated on the Tanks and Temples dataset and the FVS dataset. On average, its Learned Perceptual Image Patch Similarity (LPIPS) is better than that of state-of-the-art approaches by 0.14 to 0.16%, indicating the superiority of our approach.
KW - Deep Learning
KW - Long Short-Term Memory Mechanism
KW - Novel View Synthesis
UR - http://www.scopus.com/inward/record.url?scp=85178621328&partnerID=8YFLogxK
U2 - 10.1007/978-981-99-8148-9_25
DO - 10.1007/978-981-99-8148-9_25
M3 - Conference contribution
AN - SCOPUS:85178621328
SN - 9789819981472
T3 - Communications in Computer and Information Science
SP - 311
EP - 323
BT - Neural Information Processing - 30th International Conference, ICONIP 2023, Proceedings
A2 - Luo, Biao
A2 - Cheng, Long
A2 - Wu, Zheng-Guang
A2 - Li, Hongyi
A2 - Li, Chaojie
PB - Springer Science and Business Media Deutschland GmbH
T2 - 30th International Conference on Neural Information Processing, ICONIP 2023
Y2 - 20 November 2023 through 23 November 2023
ER -