Corpus ID: 245537311

Human View Synthesis using a Single Sparse RGB-D Input

@article{Nguyen2021HumanVS,
  title={Human View Synthesis using a Single Sparse RGB-D Input},
  author={Phong Nguyen and Nikolaos Sarafianos and Christoph Lassner and J. Heikkila and Tony Tung},
  journal={ArXiv},
  year={2021},
  volume={abs/2112.13889}
}
Novel view synthesis for humans in motion is a challenging computer vision problem that enables applications such as free-viewpoint video. Existing methods typically use complex setups with multiple input views, 3D supervision or pre-trained models that do not generalize well to new identities. Aiming to address these limitations, we present a novel view synthesis framework to generate realistic renders from unseen views of any human captured from a single-view sensor with sparse RGB-D, similar… 
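
The core geometric step such a pipeline builds on is reprojecting the captured RGB-D frame into the target camera. Below is a minimal NumPy sketch of that forward warp, assuming pinhole intrinsics `K` shared by both views and a relative pose `(R, t)`; the function name and interface are illustrative, not the paper's API.

```python
import numpy as np

def warp_rgbd(rgb, depth, K, R, t):
    """Forward-warp an RGB-D frame into a target view (illustrative).

    rgb:   (H, W, 3) color image
    depth: (H, W) metric depth, 0 where the sparse sensor has no sample
    K:     (3, 3) pinhole intrinsics, assumed shared by both views
    R, t:  rotation (3, 3) and translation (3,) from source to target
    """
    H, W = depth.shape
    v, u = np.mgrid[0:H, 0:W]
    valid = depth > 0                        # skip holes in the sparse depth

    # Back-project valid pixels to 3D points in the source camera.
    z = depth[valid]
    x = (u[valid] - K[0, 2]) * z / K[0, 0]
    y = (v[valid] - K[1, 2]) * z / K[1, 1]
    pts = np.stack([x, y, z], axis=1) @ R.T + t

    # Project the points into the target view.
    z_t = pts[:, 2]
    u_t = np.round(K[0, 0] * pts[:, 0] / z_t + K[0, 2]).astype(int)
    v_t = np.round(K[1, 1] * pts[:, 1] / z_t + K[1, 2]).astype(int)

    out = np.zeros_like(rgb)
    zbuf = np.full((H, W), np.inf)
    inb = (u_t >= 0) & (u_t < W) & (v_t >= 0) & (v_t < H) & (z_t > 0)
    for color, ut, vt, zt in zip(rgb[valid][inb], u_t[inb], v_t[inb], z_t[inb]):
        if zt < zbuf[vt, ut]:                # z-buffer keeps the nearest point
            zbuf[vt, ut] = zt
            out[vt, ut] = color
    return out, zbuf
```

The warped image is full of holes (disocclusions and missing depth samples), which is exactly what the learned stages in methods like this one are there to fill.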

Citations

Learning Dynamic View Synthesis With Few RGBD Cameras
TLDR: This work proposes to utilize RGBD cameras to synthesize free-viewpoint videos of dynamic indoor scenes, and introduces a simple Regional Depth-Inpainting module that adaptively inpaints missing depth values to render complete novel views.
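
To make the inpainting idea concrete, the toy sketch below fills depth holes by iterated neighbor averaging. It is a hand-crafted stand-in for illustration only, not the paper's learned Regional Depth-Inpainting module; `fill_depth` and its parameters are hypothetical.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def fill_depth(depth, iters=50):
    """Fill zero-valued holes in a depth map by iterated neighbor averaging."""
    d = depth.astype(np.float64).copy()
    hole = d == 0
    for _ in range(iters):
        num = uniform_filter(d, size=3)                      # local mean incl. zeros
        den = uniform_filter((d > 0).astype(float), size=3)  # fraction of valid pixels
        # num / den = mean over valid neighbors only.
        avg = np.divide(num, den, out=np.zeros_like(num), where=den > 0)
        fill = hole & (den > 0)          # holes with at least one valid neighbor
        d[fill] = avg[fill]
        hole = d == 0
        if not hole.any():
            break
    return d
```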
Animatable Neural Radiance Fields from Monocular RGB-D
TLDR: This paper introduces a novel method to integrate observations across frames and encode the appearance at each individual frame, taking as input the human pose, which models the body shape, and point clouds that cover only part of the human.

References

Showing 1-10 of 78 references
Deep Volumetric Video From Very Sparse Multi-view Performance Capture
TLDR: This work focuses on the task of template-free, per-frame 3D surface reconstruction from as few as three RGB sensors, for which conventional visual hull or multi-view stereo methods fail to generate plausible results.
Free View Synthesis
TLDR: This work presents a method for novel view synthesis from input images freely distributed around a scene; it can synthesize images for free camera movement through the scene and works for general scenes with unconstrained geometric layouts.
SynSin: End-to-End View Synthesis From a Single Image
TLDR: This work proposes a novel differentiable point cloud renderer that is used to transform a latent 3D point cloud of features into the target view, and outperforms baselines and prior work on the Matterport, Replica, and RealEstate10K datasets.
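
The key point is that the splatting must stay differentiable so gradients can reach the latent 3D features. The sketch below blends points that land on the same pixel with softmax weights on depth, a simplification of SynSin's renderer (which additionally splats each point over a small disk); all names are illustrative.

```python
import torch

def soft_splat(feats, zs, uv, H, W, tau=1.0):
    """Differentiably splat per-point features into an (H, W) grid.

    feats: (N, C) float features of projected 3D points
    zs:    (N,) depth of each point in the target camera
    uv:    (N, 2) long tensor of integer pixel coordinates
    """
    idx = uv[:, 1] * W + uv[:, 0]            # flat pixel index
    w = torch.exp(-zs / tau)                 # nearer points get larger weight
    num = torch.zeros(H * W, feats.shape[1], device=feats.device)
    den = torch.zeros(H * W, device=feats.device)
    num.index_add_(0, idx, feats * w[:, None])
    den.index_add_(0, idx, w)
    # Per-pixel softmax over depth: weighted sum / weight sum.
    return (num / den.clamp(min=1e-8)[:, None]).reshape(H, W, -1)
```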
View Synthesis by Appearance Flow
TLDR: This work addresses the problem of novel view synthesis: given an input image, synthesize new images of the same object or scene from arbitrary viewpoints. For both objects and scenes, the approach synthesizes novel views of higher perceptual quality than previous CNN-based techniques.
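
The mechanism is a dense resampling of the source image through a predicted 2D flow field. A minimal PyTorch sketch of the sampling step follows; the flow itself would be regressed by a CNN from the source image and target viewpoint, which is omitted here, and the names are illustrative.

```python
import torch
import torch.nn.functional as F

def apply_appearance_flow(src, flow):
    """Resample a source image through a per-pixel flow field.

    src:  (B, 3, H, W) source view
    flow: (B, 2, H, W) sampling offsets in pixels (x offsets first)
    """
    B, _, H, W = src.shape
    # Base sampling grid in normalized [-1, 1] coordinates.
    ys, xs = torch.meshgrid(
        torch.linspace(-1, 1, H, device=src.device),
        torch.linspace(-1, 1, W, device=src.device), indexing="ij")
    base = torch.stack([xs, ys], dim=-1).expand(B, H, W, 2)
    # Convert pixel offsets to normalized offsets, then sample bilinearly.
    offs = torch.stack(
        [flow[:, 0] * 2 / (W - 1), flow[:, 1] * 2 / (H - 1)], dim=-1)
    return F.grid_sample(src, base + offs, align_corners=True)
```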
Neural Body: Implicit Neural Representations with Structured Latent Codes for Novel View Synthesis of Dynamic Humans
TLDR: Neural Body is proposed, a new human body representation which assumes that the learned neural representations at different frames share the same set of latent codes anchored to a deformable mesh, so that the observations across frames can be naturally integrated.
Stereo Magnification: Learning View Synthesis using Multiplane Images
TLDR: This paper explores an intriguing scenario for view synthesis: extrapolating views from imagery captured by narrow-baseline stereo cameras, including VR cameras and now-widespread dual-lens camera phones, and proposes a learning framework that leverages a new layered representation called multiplane images (MPIs).
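
An MPI is a stack of fronto-parallel RGBA layers at fixed depths; a novel view is rendered by warping each layer with a per-plane homography and alpha-compositing the stack back to front. A minimal sketch of the compositing step, assuming the layers are already warped into the target view:

```python
import numpy as np

def composite_mpi(rgba_planes):
    """Render an MPI by back-to-front 'over' compositing.

    rgba_planes: (D, H, W, 4) RGBA layers ordered far to near.
    """
    out = np.zeros(rgba_planes.shape[1:3] + (3,))
    for plane in rgba_planes:                # farthest plane first
        rgb, alpha = plane[..., :3], plane[..., 3:4]
        out = rgb * alpha + out * (1.0 - alpha)
    return out
```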
IBRNet: Learning Multi-View Image-Based Rendering
TLDR: A method that synthesizes novel views of complex scenes by interpolating a sparse set of nearby views, using a network architecture comprising a multilayer perceptron and a ray transformer that estimates radiance and volume density at continuous 5D locations.
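
Those per-sample radiance and density estimates feed the standard volume-rendering quadrature: colors along a ray are blended with weights derived from density and accumulated transmittance. A generic NumPy sketch of that final step (not IBRNet's network itself):

```python
import numpy as np

def render_ray(colors, sigmas, ts):
    """Composite N samples along one ray into a pixel color.

    colors: (N, 3) radiance at the samples
    sigmas: (N,) volume density at the samples
    ts:     (N,) sample distances along the ray, increasing
    """
    deltas = np.append(np.diff(ts), 1e10)                   # last interval left open
    alphas = 1.0 - np.exp(-sigmas * deltas)                 # per-sample opacity
    trans = np.cumprod(np.append(1.0, 1.0 - alphas[:-1]))   # transmittance T_i
    weights = trans * alphas
    return (weights[:, None] * colors).sum(axis=0)
```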
LookinGood: Enhancing Performance Capture with Real-time Neural Re-Rendering
TLDR: This work augments real-time performance capture systems with a deep architecture that takes a rendering from an arbitrary viewpoint and jointly performs completion, super-resolution, and denoising of the imagery in real time.
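
At its core the re-renderer is an image-to-image network mapping a degraded geometry render to an enhanced image. A toy PyTorch encoder-decoder in that spirit, far smaller than the system's real network and purely illustrative:

```python
import torch.nn as nn

class ReRenderNet(nn.Module):
    """Toy stand-in for a neural re-rendering network (illustrative)."""

    def __init__(self, ch=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, ch, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(ch, ch, 3, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(ch, ch, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(ch, 3, 4, stride=2, padding=1),
        )

    def forward(self, render):
        # Input and output: (B, 3, H, W), with H and W divisible by 4.
        return self.net(render)
```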
RGBD-Net: Predicting Color and Depth Images for Novel Views Synthesis
TLDR: RGBD-Net not only produces novel views of higher quality than previous state-of-the-art methods; its predicted depth maps also enable reconstruction of more accurate 3D point clouds than existing multi-view stereo methods.
Learning-based view synthesis for light field cameras
TLDR: This paper proposes a novel learning-based approach to synthesize new views from a sparse set of input views, which could decrease the required angular resolution of consumer light field cameras and thereby allow their spatial resolution to increase.