LookinGood: Enhancing Performance Capture with Real-time Neural Re-Rendering

@article{MartinBrualla2018LookinGood,
  title={LookinGood: Enhancing Performance Capture with Real-time Neural Re-Rendering},
  author={Ricardo Martin-Brualla and Rohit Pandey and Shuoran Yang and Pavel Pidlypenskyi and Jonathan Taylor and Julien P. C. Valentin and Sameh Khamis and Philip L. Davidson and Anastasia Tkach and Peter Lincoln and Adarsh Kowdle and Christoph Rhemann and Dan B. Goldman and Cem Keskin and Steven M. Seitz and Shahram Izadi and Sean Fanello},
  journal={ACM Trans. Graph.},
}
Motivated by augmented and virtual reality applications such as telepresence, there has been a recent focus on real-time performance capture of humans under motion. […] We call this approach neural (re-)rendering, and our live system "LookinGood". Our deep architecture is trained to produce high-resolution, high-quality images from a coarse rendering in real time. First, we propose a self-supervised training method that does not require manual ground-truth annotation. We contribute a specialized…
LookinGood^π: Real-time Person-independent Neural Re-rendering for High-quality Human Performance Capture
We propose LookinGood^π, a novel neural re-rendering approach that aims to (1) improve the rendering quality of the low-quality reconstructed results from a human performance capture system in…
NeuralHumanFVV: Real-Time Neural Volumetric Human Performance Rendering using RGB Cameras
  • Xin Suo, Yuheng Jiang, Lan Xu
  • 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021
A real-time neural human performance capture and rendering system that generates both high-quality geometry and photo-realistic texture of human activities in arbitrary novel views; it adopts neural normal blending to enhance geometry details and formulates the neural geometry and texture rendering as a multi-task learning framework.
Real-Time Neural Character Rendering with Pose-Guided Multiplane Images
This work proposes pose-guided multiplane image (MPI) synthesis, which can render an animatable character in real scenes with photorealistic quality, and demonstrates advantageous novel-view synthesis quality over state-of-the-art approaches for characters with challenging motions.
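The MPI representation behind this line of work stacks fronto-parallel RGBA planes and alpha-composites them back to front with the standard "over" operator. A minimal NumPy sketch of just that compositing step (illustrative only; the pose-guided synthesis network itself is not reproduced here):

```python
import numpy as np

def composite_mpi(planes):
    """Back-to-front 'over' compositing of MPI layers.

    planes: array of shape (D, H, W, 4), ordered back to front,
            with RGB in channels 0..2 and alpha in channel 3.
    Returns an (H, W, 3) image.
    """
    out = np.zeros(planes.shape[1:3] + (3,))
    for layer in planes:  # back to front
        rgb, alpha = layer[..., :3], layer[..., 3:4]
        out = rgb * alpha + out * (1.0 - alpha)
    return out

# Two 1x1 planes: an opaque red background, a half-transparent green foreground.
planes = np.array([
    [[[1.0, 0.0, 0.0, 1.0]]],   # back plane
    [[[0.0, 1.0, 0.0, 0.5]]],   # front plane
])
print(composite_mpi(planes)[0, 0])  # → [0.5 0.5 0. ]
```

View synthesis with MPIs then amounts to warping each plane by a homography for the target camera before this compositing step.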
Deep relightable textures
This paper proposes a system that combines traditional geometric pipelines with a neural rendering scheme to generate photorealistic renderings of dynamic performances under desired viewpoint and lighting, significantly outperforming the existing state-of-the-art solutions.
Rig-space Neural Rendering
The idea is to render the character in many different poses and views, and to train a deep neural network to render high-resolution images directly from the rig parameters, learning a compact image-based rendering of the original 3D character.
Rig-space Neural Rendering: Compressing the Rendering of Characters for Previs, Real-time Animation and High-quality Asset Re-use
The model learns to render an image directly from the rig parameters at a high resolution, and extends the architecture to support dynamic re-lighting and composition with other objects in the scene.
Neural volumes
This work presents a learning-based approach to representing dynamic objects inspired by the integral projection model used in tomographic imaging, and learns a latent representation of a dynamic scene that enables us to produce novel content sequences not seen during training.
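The integral projection mentioned above is commonly discretized as emission-absorption quadrature along each camera ray. A generic NumPy sketch of that accumulation (a textbook formulation, not the paper's specific decoder or warp field):

```python
import numpy as np

def render_ray(rgb, density, step):
    """Accumulate color along one ray through a sampled volume.

    rgb:     (N, 3) color samples along the ray, near to far
    density: (N,) non-negative density samples
    step:    distance between consecutive samples
    Each sample contributes its color weighted by its alpha and by the
    transmittance remaining after the samples in front of it.
    """
    alpha = 1.0 - np.exp(-density * step)                            # per-sample opacity
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alpha[:-1]]))    # transmittance
    weights = alpha * trans
    return (weights[:, None] * rgb).sum(axis=0)

rgb = np.array([[1.0, 0.0, 0.0],   # red sample in empty space
                [0.0, 0.0, 1.0]])  # effectively opaque blue sample
density = np.array([0.0, 1e6])
print(render_ray(rgb, density, step=1.0))  # → [0. 0. 1.]
```

The empty first sample contributes nothing, so the ray returns the color of the first (near-)opaque sample it hits.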
Few-shot Neural Human Performance Rendering from Sparse RGBD Videos
This paper proposes a few-shot neural human rendering approach (FNHR) from only sparse RGBD inputs, which exploits the temporal and spatial redundancy to generate photo-realistic free-view output of human activities.
NeuralHOFusion: Neural Volumetric Rendering under Human-object Interactions
The proposed NeuralHOFusion is a neural approach for volumetric human-object capture and rendering using sparse consumer RGBD sensors; it marries traditional non-rigid fusion with recent neural implicit modeling and blending advances, where the captured humans and objects are layer-wise disentangled.
Artemis: Articulated Neural Pets with Appearance and Motion Synthesis
ARTEMIS enables interactive motion control, real-time animation, and photo-realistic rendering of furry animals. Its core is a neural-generated imagery (NGI) animal engine, which adopts an efficient octree-based representation…


Fusion4D: real-time performance capture of challenging scenes
This work contributes a new pipeline for live multi-view performance capture, generating temporally coherent high-quality reconstructions in real-time, highly robust to both large frame-to-frame motion and topology changes, allowing us to reconstruct extremely challenging scenes.
Motion2fusion: real-time volumetric performance capture
This work provides major contributions over prior work, including: a new non-rigid fusion pipeline allowing far more faithful reconstruction of high-frequency geometric details, avoiding the over-smoothing and visual artifacts observed previously; and a high-speed pipeline coupled with a machine learning technique for 3D correspondence field estimation, reducing tracking errors and artifacts attributed to fast motions.
Montage4D: interactive seamless fusion of multiview video textures
This paper builds on the ideas of dilated depth discontinuities and majority voting from Holoportation to reduce ghosting effects when blending textures, and determines the appropriate blend of textures per vertex using view-dependent rendering techniques, to avert fuzziness caused by the ubiquitous normal-weighted blending.
ActiveStereoNet: End-to-End Self-Supervised Learning for Active Stereo Systems
This paper presents ActiveStereoNet, the first deep learning solution for active stereo systems that is fully self-supervised, yet it produces precise depth with a subpixel precision; it does not suffer from the common over-smoothing issues; it preserves the edges; and it explicitly handles occlusions.
StereoNet: Guided Hierarchical Refinement for Real-Time Edge-Aware Depth Prediction
This paper presents StereoNet, the first end-to-end deep architecture for real-time stereo matching that runs at 60 fps on an NVidia Titan X, producing high-quality, edge-preserved, quantization-free disparity maps.
High-quality video view interpolation using a layered representation
This paper shows how high-quality video-based rendering of dynamic scenes can be accomplished using multiple synchronized video streams combined with novel image-based modeling and rendering algorithms, and develops a novel temporal two-layer compressed representation that handles matting.
The need 4 speed in real-time dense visual tracking
This paper proposes a novel combination of hardware and software components that avoids the need to compromise between a dense accurate depth map and a high frame rate, and proposes a machine learning based depth refinement step that is an order of magnitude faster than traditional postprocessing methods.
Perceptual Losses for Real-Time Style Transfer and Super-Resolution
This work considers image transformation problems and proposes the use of perceptual loss functions for training feed-forward networks for image transformation tasks, showing results on image style transfer, where a feed-forward network is trained to solve, in real time, the optimization problem proposed by Gatys et al.
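The idea can be sketched by comparing images in a feature space rather than in pixel space. The `features` function below is a hypothetical stand-in (a fixed 3x3 edge filter) for the pretrained VGG activations used in the paper; it exists only to make the contrast with pixel-wise MSE concrete:

```python
import numpy as np

def features(img):
    """Stand-in feature extractor: a fixed 3x3 Laplacian edge filter.
    (The paper uses activations of a pretrained VGG network instead.)"""
    k = np.array([[0, 1, 0], [1, -4, 1], [0, 1, 0]], dtype=float)
    h, w = img.shape
    out = np.zeros((h - 2, w - 2))
    for i in range(h - 2):
        for j in range(w - 2):
            out[i, j] = (img[i:i + 3, j:j + 3] * k).sum()
    return out

def perceptual_loss(pred, target):
    """Mean squared error between feature maps, not raw pixels."""
    d = features(pred) - features(target)
    return (d ** 2).mean()

# A constant brightness shift leaves the edge features unchanged, so the
# perceptual loss stays near zero while plain pixel MSE does not.
img = np.random.default_rng(0).random((8, 8))
print(perceptual_loss(img, img + 0.1))    # near zero
print(((img - (img + 0.1)) ** 2).mean())  # ≈ 0.01
```

The point of training with such a loss is that the feed-forward network is penalized for perceptually salient differences (structure, edges) rather than for exact pixel agreement.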
Jump: virtual reality video
The distortions inherent to ODS when used for VR display, as well as those introduced by the capture method, are analyzed, showing that they are small enough to make this approach suitable for capturing a wide variety of scenes.
DeepStereo: Learning to Predict New Views from the World's Imagery
This work presents a novel deep architecture that performs new view synthesis directly from pixels, trained on a large number of posed image sets, and is the first to apply deep learning to the problem of new view synthesis from sets of real-world, natural imagery.