• Corpus ID: 246430332

SelfRecon: Self Reconstruction Your Digital Avatar from Monocular Video

Boyi Jiang, Yang Hong, Hujun Bao, Juyong Zhang
We propose SelfRecon, a clothed human body reconstruction method that combines implicit and explicit representations to recover space-time coherent geometries from a monocular self-rotating human video. Explicit methods require a predefined template mesh for a given sequence, while the template is hard to acquire for a specific subject. Meanwhile, the fixed topology limits the reconstruction accuracy and clothing types. Implicit representation supports arbitrary topology and can represent high… 


Animatable Implicit Neural Representations for Creating Realistic Avatars from Videos
This work introduces a pose-driven deformation field based on the linear blend skinning algorithm, which combines blend weights with the 3D human skeleton to produce observation-to-canonical correspondences, and it outperforms recent human modeling methods.
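As a rough illustration of the linear blend skinning idea mentioned in the summary above, the following minimal NumPy sketch blends per-bone rigid transforms with per-point weights to map canonical points to the observed pose, and inverts the blended transform to go back from observation to canonical space. It is not the cited paper's implementation: the function names, the array shapes, and the assumption that blend weights are already known for each point are all hypothetical choices made for this example.

```python
import numpy as np

def blend_transforms(weights, bone_transforms):
    """Blend per-bone 4x4 transforms into one 4x4 transform per point.

    weights: (N, B) blend weights, each row assumed to sum to 1.
    bone_transforms: (B, 4, 4) rigid transforms, one per bone.
    """
    return np.einsum("nb,bij->nij", weights, bone_transforms)

def to_homogeneous(points):
    """Append a 1 to every 3D point so 4x4 transforms can be applied."""
    ones = np.ones((points.shape[0], 1))
    return np.concatenate([points, ones], axis=1)

def lbs_forward(points_canonical, weights, bone_transforms):
    """Canonical-to-observation mapping via linear blend skinning."""
    blended = blend_transforms(weights, bone_transforms)
    posed = np.einsum("nij,nj->ni", blended, to_homogeneous(points_canonical))
    return posed[:, :3]

def lbs_inverse(points_observed, weights, bone_transforms):
    """Observation-to-canonical mapping, assuming the weights of each
    observed point are given (in practice they must be estimated)."""
    blended = blend_transforms(weights, bone_transforms)
    inverse = np.linalg.inv(blended)  # batched inverse over N matrices
    canonical = np.einsum("nij,nj->ni", inverse, to_homogeneous(points_observed))
    return canonical[:, :3]
```

Mapping points forward with `lbs_forward` and then back with `lbs_inverse`, using the same weights and bone transforms, should recover the original canonical points up to floating-point error.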
AvatarGen: a 3D Generative Model for Animatable Human Avatars
AvatarGen is proposed, the first method that enables not only non-rigid human generation with diverse appearance but also full control over poses and viewpoints, while requiring only 2D images for training. It is suited to many applications, e.g. single-view reconstruction, reanimation, and text-guided synthesis.
Neural Surface Reconstruction of Dynamic Scenes with Monocular RGB-D Camera
This work proposes Neural-DynamicReconstruction (NDR), a template-free method to recover high-fidelity geometry and motions of a dynamic scene from a monocular RGB-D camera that outperforms existing monocular dynamic reconstruction methods.
Generative Neural Articulated Radiance Fields
This work develops a 3D GAN framework that learns to generate radiance fields of human bodies or faces in a canonical pose and warp them using an explicit deformation into a desired body pose or facial expression, and demonstrates the first high-quality radiance field generation results for human bodies.
I M Avatar: Implicit Morphable Head Avatars from Videos
This work proposes IMavatar, a novel method for learning implicit head avatars from monocular videos that improves geometry and covers a more complete expression space compared to state-of-the-art methods.


SNARF: Differentiable Forward Skinning for Animating Non-Rigid Neural Implicit Shapes
SNARF is introduced, which combines the advantages of linear blend skinning for polygonal meshes with those of neural implicit surfaces by learning a forward deformation field without direct supervision, allowing for generalization to unseen poses.
Multiview Neural Surface Reconstruction by Disentangling Geometry and Appearance
This work introduces a neural network architecture that simultaneously learns the unknown geometry, camera parameters, and a neural renderer that approximates the light reflected from the surface towards the camera.
PIFuHD: Multi-Level Pixel-Aligned Implicit Function for High-Resolution 3D Human Digitization
This work formulates a multi-level architecture that is end-to-end trainable and significantly outperforms existing state-of-the-art techniques on single image human shape reconstruction by fully leveraging 1k-resolution input images.
Implicit Geometric Regularization for Learning Shapes
It is observed that a rather simple loss function, encouraging the neural network to vanish on the input point cloud and to have a unit norm gradient, possesses an implicit geometric regularization property that favors smooth and natural zero level set surfaces, avoiding bad zero-loss solutions.
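The loss described in the summary above can be sketched in a few lines. This is a minimal NumPy illustration, not the paper's code: it uses central finite differences as a stand-in for automatic differentiation, and the names `igr_style_loss`, `numerical_gradient`, `unit_sphere_sdf`, and the `eikonal_weight` parameter with its default value are assumptions introduced for this example.

```python
import numpy as np

def numerical_gradient(f, x, eps=1e-4):
    """Central finite-difference gradient of a scalar field f at points x.

    f maps an (N, D) float array to an (N,) array; x has shape (N, D).
    A real training setup would use automatic differentiation instead.
    """
    grad = np.zeros_like(x)
    for axis in range(x.shape[1]):
        offset = np.zeros(x.shape[1])
        offset[axis] = eps
        grad[:, axis] = (f(x + offset) - f(x - offset)) / (2.0 * eps)
    return grad

def igr_style_loss(f, surface_points, space_points, eikonal_weight=0.1):
    """Surface term plus unit-gradient-norm (eikonal) term.

    surface_points: samples of the input point cloud, where f should vanish.
    space_points: samples in the surrounding volume, where |grad f| should be 1.
    """
    surface_term = np.mean(np.abs(f(surface_points)))
    grad_norm = np.linalg.norm(numerical_gradient(f, space_points), axis=1)
    eikonal_term = np.mean((grad_norm - 1.0) ** 2)
    return surface_term + eikonal_weight * eikonal_term

def unit_sphere_sdf(x):
    """Exact signed distance to the unit sphere, used as a toy field."""
    return np.linalg.norm(x, axis=1) - 1.0
```

An exact signed distance function such as `unit_sphere_sdf` should give a loss close to zero when the surface samples lie on the sphere, while scaling the field by a constant leaves the surface term at zero but is penalized by the eikonal term.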
Video Based Reconstruction of 3D People Models
This paper describes a method to obtain accurate 3D body models and texture of arbitrary people from a single monocular video in which a person is moving, and presents a robust processing pipeline to infer 3D model shapes, including clothed people, with 4.5 mm reconstruction accuracy.
MonoPerfCap: Human Performance Capture from Monocular Video
This work presents the first marker-less approach for temporally coherent 3D performance capture of a human with general clothing from monocular video that significantly outperforms previous monocular methods in terms of accuracy, robustness and scene complexity that can be handled.
SMPL: A Skinned Multi-Person Linear Model
ACM Transactions on Graphics (TOG), 2015
PaMIR: Parametric Model-Conditioned Implicit Representation for Image-Based Human Reconstruction
In the PaMIR-based reconstruction framework, a novel deep neural network is proposed to regularize the free-form deep implicit function using the semantic features of the parametric model, which improves the generalization ability under the scenarios of challenging poses and various clothing topologies.
NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis
This work describes how to effectively optimize neural radiance fields to render photorealistic novel views of scenes with complicated geometry and appearance, and demonstrates results that outperform prior work on neural rendering and view synthesis.
Neural Body: Implicit Neural Representations with Structured Latent Codes for Novel View Synthesis of Dynamic Humans
Neural Body is proposed, a new human body representation which assumes that the learned neural representations at different frames share the same set of latent codes anchored to a deformable mesh, so that the observations across frames can be naturally integrated.