Novel View Synthesis via Depth-guided Skip Connections

Yuxin Hou, A. Solin, Juho Kannala. 2021 IEEE Winter Conference on Applications of Computer Vision (WACV).
We introduce a principled approach for synthesizing new views of a scene given a single source image. Previous methods for novel view synthesis can be divided into image-based rendering methods (e.g., flow prediction) and pixel generation methods. Flow prediction lets the target view re-use source pixels directly, but can easily lead to distorted results. Directly regressing pixels can produce structurally consistent results, but generally suffers from a lack of low-level detail. In this paper…
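The "re-use pixels directly" idea the abstract contrasts with pixel regression can be illustrated with a minimal backward-warping sketch. This is not the paper's method, just a toy numpy version of flow-based rendering: each target pixel looks up a source pixel via a predicted flow field. The function name and nearest-neighbor sampling (real systems use bilinear sampling, e.g. `grid_sample`) are illustrative assumptions.

```python
import numpy as np

def warp_by_flow(src, flow):
    """Backward-warp a grayscale image using a flow field.

    src:  (H, W) source image.
    flow: (H, W, 2) flow mapping each target pixel (x, y) to the
          source location (x + flow[..., 0], y + flow[..., 1]).
    Uses nearest-neighbor sampling with border clamping for simplicity.
    """
    H, W = src.shape
    ys, xs = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")
    sx = np.clip(np.round(xs + flow[..., 0]).astype(int), 0, W - 1)
    sy = np.clip(np.round(ys + flow[..., 1]).astype(int), 0, H - 1)
    return src[sy, sx]
```

Because every output value is copied from the source, warping preserves low-level texture, but an inaccurate flow field directly produces the kind of distortion the abstract mentions.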

Monocular Neural Image Based Rendering With Continuous View Control
The experiments show that both proposed components, the transforming encoder-decoder and depth-guided appearance mapping, lead to significantly improved generalization beyond the training views and, consequently, to more accurate view synthesis under continuous 6-DoF camera control.
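Depth-guided mapping of this kind rests on a standard geometric step: backproject each target pixel using its depth, apply the relative camera pose, and project into the source view. The sketch below is a generic numpy version of that reprojection, not code from any of the cited papers; the function name and pinhole-camera assumptions are illustrative.

```python
import numpy as np

def reproject(depth, K, R, t):
    """Map each target pixel to source-view image coordinates.

    depth: (H, W) depth of each target pixel.
    K:     (3, 3) camera intrinsics (shared by both views).
    R, t:  rotation (3, 3) and translation (3,) from target to source camera.
    Returns uv: (H, W, 2) source-view (x, y) coordinates per target pixel.
    """
    H, W = depth.shape
    ys, xs = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")
    pix = np.stack([xs, ys, np.ones_like(xs)], axis=-1).reshape(-1, 3).T  # (3, N)
    cam = np.linalg.inv(K) @ pix * depth.reshape(1, -1)  # backproject to 3D
    cam2 = R @ cam + t[:, None]                          # rigid transform
    proj = K @ cam2                                      # project to source
    return (proj[:2] / proj[2:]).T.reshape(H, W, 2)
```

The resulting coordinate map is exactly what a warping operator (such as the one in the flow example above, or bilinear sampling in practice) consumes to fetch source-view appearance.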
View Synthesis by Appearance Flow
This work addresses the problem of novel view synthesis: given an input image, synthesizing new images of the same object or scene observed from arbitrary viewpoints. For both objects and scenes, the approach synthesizes novel views of higher perceptual quality than previous CNN-based techniques.
Multi-view to Novel View: Synthesizing Novel Views With Self-learned Confidence
This paper proposes an end-to-end trainable framework that learns to exploit multiple viewpoints to synthesize a novel view without any 3D supervision, and introduces a self-learned confidence aggregation mechanism.
Single-View View Synthesis With Multiplane Images
This work learns to predict a multiplane image directly from a single image input, and introduces scale-invariant view synthesis for supervision, enabling training on online video.
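The multiplane image (MPI) representation used in this and the Stereo Magnification work below renders a view by alpha-compositing a stack of fronto-parallel RGBA planes from back to front. A minimal numpy sketch of that compositing step, under the standard "over" operator (function name and shapes are illustrative assumptions):

```python
import numpy as np

def composite_mpi(rgbs, alphas):
    """Composite MPI layers back-to-front with the 'over' operator.

    rgbs:   list of (H, W, 3) color planes, ordered far to near.
    alphas: list of (H, W, 1) opacity planes in [0, 1], same order.
    """
    out = np.zeros_like(rgbs[0])
    for rgb, a in zip(rgbs, alphas):
        out = rgb * a + out * (1.0 - a)  # nearer layers occlude farther ones
    return out
```

Rendering a *novel* view additionally warps each plane by the homography its depth induces before compositing; the compositing itself stays as above.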
Transformation-Grounded Image Generation Network for Novel 3D View Synthesis
We present a transformation-grounded image generation network for novel 3D view synthesis from a single image. Our approach first explicitly infers the parts of the geometry visible both in the input…
SynSin: End-to-End View Synthesis From a Single Image
This work proposes a novel differentiable point cloud renderer that is used to transform a latent 3D point cloud of features into the target view and outperforms baselines and prior work on the Matterport, Replica, and RealEstate10K datasets.
Stereo Magnification: Learning View Synthesis using Multiplane Images
This paper explores an intriguing scenario for view synthesis: extrapolating views from imagery captured by narrow-baseline stereo cameras, including VR cameras and now-widespread dual-lens camera phones. It proposes a learning framework that leverages a new layered representation called multiplane images (MPIs).
Extreme View Synthesis
This work presents Extreme View Synthesis, a solution for novel view extrapolation that works even with as few as two input images, and is the first to show visually pleasing results for baseline magnifications of up to 30x.
Visual Object Networks: Image Generation with Disentangled 3D Representations
A new generative model, Visual Object Networks (VONs), synthesizes natural images of objects with a disentangled 3D representation that enables many 3D operations, such as changing the viewpoint of a generated image, shape and texture editing, linear interpolation in texture and shape space, and transferring appearance across different objects and viewpoints.
Multi-view Supervision for Single-view Reconstruction via Differentiable Ray Consistency
A differentiable formulation that allows computing gradients of a 3D shape given an observation from an arbitrary view is proposed, by reformulating view consistency using a differentiable ray consistency (DRC) term. This formulation can be incorporated in a learning framework to leverage different types of multi-view observations.