Novel view synthesis with wide-baseline stereo pairs based on local–global information

Kai Song, Lei Zhang*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

1 Citation (Scopus)

Abstract

Novel view synthesis generates images from new viewpoints using multiple images of a scene captured from known views. Using wide-baseline stereo image pairs for novel view synthesis allows scenes to be rendered from varied perspectives with only two images, significantly reducing image acquisition and storage costs and improving 3D scene reconstruction efficiency. However, the large geometric differences and severe occlusions between a pair of wide-baseline stereo images often cause artifacts and holes in the novel view images. To address these issues, we propose a method that integrates both local and global information to synthesize novel view images from wide-baseline stereo image pairs. First, our method aggregates a cost volume with local information using a Convolutional Neural Network (CNN) and employs a Transformer to capture global features. This step optimizes disparity prediction, improving depth prediction and the reconstruction quality of 3D scene representations built from wide-baseline stereo image pairs. Then, our method uses a CNN to capture local semantic information and a Transformer to model long-range contextual dependencies, generating high-quality novel view images. Extensive experiments demonstrate that our method effectively reduces artifacts and holes, thereby enhancing the synthesis quality of novel views from wide-baseline stereo image pairs.
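The artifacts and holes the abstract refers to arise in the warping step, where pixels from a source view are shifted by their predicted disparity to form the novel view. The following is a minimal, illustrative sketch of disparity-based forward warping on a single image row; all names are assumptions for illustration, not the paper's implementation. With a wide baseline, large disparities leave many target columns unfilled, producing the holes that the proposed local–global network is designed to reduce.

```python
def forward_warp_row(pixels, disparity, alpha=0.5):
    """Warp one image row toward a virtual view at fraction `alpha`
    of the baseline. Columns that receive no source pixel remain
    holes (None). Hypothetical helper, for illustration only."""
    width = len(pixels)
    warped = [None] * width
    best_disp = [float("-inf")] * width  # larger disparity = nearer surface
    for x, (value, d) in enumerate(zip(pixels, disparity)):
        xt = round(x - alpha * d)  # target column in the novel view
        if 0 <= xt < width and d > best_disp[xt]:
            # Occlusion handling: the nearer surface (larger disparity) wins.
            best_disp[xt] = d
            warped[xt] = value
    return warped

row = [10, 20, 30, 40, 50, 60]
disp = [0, 0, 4, 4, 0, 0]  # a near object in the middle of the row
novel = forward_warp_row(row, disp)
# The near object shifts left and occludes the background there,
# while the columns it vacated become holes (None).
```

In a full pipeline this warping would be applied per scanline of rectified stereo images, and the resulting holes would then be filled by a synthesis network; the local–global design in the paper targets both the disparity prediction feeding this step and the subsequent image generation.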

Original language: English
Article number: 104139
Journal: Computers and Graphics (Pergamon)
Volume: 126
DOIs
Publication status: Published - Feb 2025

Keywords

  • Depth prediction
  • Novel view synthesis
  • Warping
  • Wide-baseline stereo image pair
