SCANimate: Weakly Supervised Learning of Skinned Clothed Avatar Networks

@article{Saito2021SCANimateWS,
  title={SCANimate: Weakly Supervised Learning of Skinned Clothed Avatar Networks},
  author={Shunsuke Saito and Jinlong Yang and Qianli Ma and Michael J. Black},
  journal={2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2021},
  pages={2885-2896}
}
We present SCANimate, an end-to-end trainable framework that takes raw 3D scans of a clothed human and turns them into an animatable avatar. These avatars are driven by pose parameters and have realistic clothing that moves and deforms naturally. SCANimate does not rely on a customized mesh template or surface mesh registration. We observe that fitting a parametric 3D body model, like SMPL, to a clothed human scan is tractable while surface registration of the body topology to the scan is often… 
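Since the resulting avatars are driven by pose parameters through learned skinning, it helps to recall the linear blend skinning (LBS) operation that SCANimate and many of the papers below build on: each vertex is deformed by a weight-blended combination of rigid bone transforms. The following is a minimal sketch in Python/NumPy; the toy vertices, weights, and bone transforms are illustrative stand-ins, not SCANimate's actual learned pipeline.

import numpy as np

def linear_blend_skinning(vertices, weights, joint_transforms):
    # vertices:         (V, 3) canonical (rest-pose) positions
    # weights:          (V, J) skinning weights, each row summing to 1
    # joint_transforms: (J, 4, 4) rigid bone transforms for the target pose
    V = vertices.shape[0]
    homo = np.concatenate([vertices, np.ones((V, 1))], axis=1)     # (V, 4)
    # Per-vertex blended transform: T_v = sum_j w[v, j] * G[j]
    blended = np.einsum('vj,jab->vab', weights, joint_transforms)  # (V, 4, 4)
    posed = np.einsum('vab,vb->va', blended, homo)                 # (V, 4)
    return posed[:, :3]

# Toy usage: three vertices skinned to two bones; bone 1 is translated upward.
verts = np.array([[0.0, 0.0, 0.0], [0.5, 0.0, 0.0], [1.0, 0.0, 0.0]])
w = np.array([[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]])
G0 = np.eye(4)
G1 = np.eye(4); G1[:3, 3] = [0.0, 0.2, 0.0]
print(linear_blend_skinning(verts, w, np.stack([G0, G1])))

SCANimate's contribution is, in part, learning pose-dependent skinning for clothed scans rather than relying on a fixed template's weights; the sketch above only shows the underlying skinning operation.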

Citations

PINA: Learning a Personalized Implicit Neural Avatar from a Single RGB-D Video Sequence

We present a novel method to learn Personalized Implicit Neural Avatars (PINA) from a short RGB-D sequence. This allows non-expert users to create a detailed and personalized virtual copy of…

BANMo: Building Animatable 3D Neural Models from Many Casual Videos

This work aims to create high-fidelity, articulated 3D models from many casual RGB videos in a differentiable rendering framework, and introduces neural blend skinning models that allow for differentiable and invertible articulated deformations.

gDNA: Towards Generative Detailed Neural Avatars

A novel method is proposed that learns to generate detailed 3D shapes of people in a variety of garments with corresponding skinning weights; it can be used for the task of fitting human models to raw scans, outperforming the previous state of the art.

Animatable Implicit Neural Representations for Creating Realistic Avatars from Videos

A pose-driven deformation based on the linear blend skinning algorithm is introduced, combining blend weights with the 3D human skeleton to produce observation-to-canonical correspondences; the approach outperforms recent human modeling methods.
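As a point of reference for the "observation-to-canonical correspondences" mentioned above: backward (inverse) skinning maps an observed point into canonical space by inverting the blended bone transform. A minimal sketch, continuing the toy conventions of the LBS example earlier (querying weights in observation space is a simplification for illustration, not the paper's exact formulation):

import numpy as np

def inverse_skin(y, weights_at_y, transforms):
    # y:            (3,) observed (posed) point
    # weights_at_y: (J,) skinning weights associated with y
    # transforms:   (J, 4, 4) rigid bone transforms for the observed pose
    blended = np.einsum('j,jab->ab', weights_at_y, transforms)  # (4, 4)
    return (np.linalg.inv(blended) @ np.append(y, 1.0))[:3]    # canonical point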

I M Avatar: Implicit Morphable Head Avatars from Videos

IMavatar (Implicit Morphable avatar) is a novel method for learning implicit head avatars from monocular videos. Inspired by the fine-grained control mechanisms afforded by conventional 3DMMs, it represents the expression- and pose-related deformations via learned blendshapes and skinning fields.
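For context on the blendshape component: a blendshape model expresses deformed geometry as a template plus a coefficient-weighted sum of learned offset bases. A minimal sketch (names and shapes are hypothetical, not IMavatar's API):

import numpy as np

def apply_blendshapes(template, blendshape_bases, coeffs):
    # template:         (V, 3) neutral geometry
    # blendshape_bases: (K, V, 3) learned per-vertex offset bases
    # coeffs:           (K,) expression/pose coefficients
    return template + np.einsum('k,kvc->vc', coeffs, blendshape_bases)

IMavatar's novelty is learning such blendshapes and skinning fields as continuous functions rather than fixed mesh offsets; the sketch only shows the classical combination rule.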

SNARF: Differentiable Forward Skinning for Animating Non-Rigid Neural Implicit Shapes

SNARF is introduced, which combines the advantages of linear blend skinning for polygonal meshes with those of neural implicit surfaces by learning a forward deformation field without direct supervision, allowing for generalization to unseen poses.
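Forward skinning raises a practical question: to evaluate the canonical shape at a point observed in posed space, the forward deformation must be inverted numerically, since the skinning weights live in canonical space. Below is a toy illustration of that inversion via damped fixed-point iteration; SNARF itself uses Broyden's method and handles multiple candidate roots, so this is a simplified assumption, not the paper's solver. The hand-written weight field stands in for the learned one.

import numpy as np

def weight_field(x):
    # Toy smooth two-bone weight field over canonical space
    # (stands in for a learned neural weight field).
    w1 = 1.0 / (1.0 + np.exp(-4.0 * x[0]))  # sigmoid along the x-axis
    return np.array([1.0 - w1, w1])

def forward_skin(x, transforms):
    # Forward LBS: blend bone transforms by the weights at canonical point x.
    blended = np.einsum('j,jab->ab', weight_field(x), transforms)
    return (blended @ np.append(x, 1.0))[:3]

def canonical_correspondence(y, transforms, iters=50, step=1.0):
    # Find canonical x with forward_skin(x) ~= y by damped fixed-point
    # iteration on the residual y - forward_skin(x).
    x = y.copy()  # initialize at the deformed point
    for _ in range(iters):
        x = x + step * (y - forward_skin(x, transforms))
    return x

G0 = np.eye(4)
G1 = np.eye(4); G1[:3, 3] = [0.0, 0.3, 0.0]  # bone 1 translated upward
transforms = np.stack([G0, G1])
y = forward_skin(np.array([0.5, 0.0, 0.0]), transforms)
print(canonical_correspondence(y, transforms))  # recovers ~[0.5, 0.0, 0.0]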

Drivable Volumetric Avatars using Texel-Aligned Features

This work proposes an end-to-end framework that addresses two core challenges in modeling and driving full-body avatars of real people, and introduces texel-aligned features—a localised representation which can leverage both the structural prior of a skeleton-based parametric model and observed sparse image signals at the same time.

SCALE: Modeling Clothed Humans with a Surface Codec of Articulated Local Elements

This work deforms surface elements based on a human body model such that large-scale deformations caused by articulation are explicitly separated from topological changes and local clothing deformations; it addresses the limitations of existing neural surface elements by regressing local geometry from local features.

Neural Point-based Shape Modeling of Humans in Challenging Clothing

A point-based method is extended with a coarse stage that replaces canonicalization with a learned pose-independent “coarse shape”, which can capture the rough surface geometry of clothing like skirts, greatly simplifying the process of creating realistic avatars.

DANBO: Disentangled Articulated Neural Body Representations via Graph Neural Networks

A three-stage method that induces two inductive biases to better disentangle pose-dependent deformation is introduced; the proposed representation strikes a better trade-off between model capacity, expressiveness, and robustness than competing methods.
...

References

Showing 1-10 of 76 references

SCALE: Modeling Clothed Humans with a Surface Codec of Articulated Local Elements

This work deforms surface elements based on a human body model such that large-scale deformations caused by articulation are explicitly separated from topological changes and local clothing deformations; it addresses the limitations of existing neural surface elements by regressing local geometry from local features.

Learning to Dress 3D People in Generative Clothing

This work learns a generative 3D mesh model of clothed people from 3D scans with varying pose and clothing, and is the first generative model that directly dresses 3D human body meshes and generalizes to different poses.

GHUM & GHUML: Generative 3D Human Shape and Articulated Pose Models

A statistical, articulated 3D human shape modeling pipeline is presented within a fully trainable, modular, deep learning framework; it supports facial expression analysis as well as body shape and pose estimation.

ARCH: Animatable Reconstruction of Clothed Humans

This paper proposes ARCH (Animatable Reconstruction of Clothed Humans), a novel end-to-end framework for accurate reconstruction of animation-ready 3D clothed humans from a monocular image and shows numerous qualitative examples of animated, high-quality reconstructed avatars unseen in the literature so far.

BCNet: Learning Body and Cloth Shape from A Single Image

This paper proposes a layered garment representation on top of SMPL and, as its key novelty, makes the garment's skinning weights independent of the body mesh, which significantly improves the expressive power of the garment model.

SMPLicit: Topology-aware Generative Model for Clothed People

SMPLicit is introduced, a novel generative model that jointly represents body pose, shape, and clothing geometry, and can represent different garment topologies in a unified manner while controlling properties like garment size or tightness/looseness.

Learning to Reconstruct People in Clothing From a Single RGB Camera

We present Octopus, a learning-based model to infer the personalized 3D shape of people from a few frames (1-8) of a monocular video in which the person is moving, with a reconstruction accuracy of 4…

Detailed, Accurate, Human Shape Estimation from Clothed 3D Scan Sequences

This work contributes a new approach to recovering the personalized shape of a person by estimating body shape under clothing from a sequence of 3D scans; it outperforms the state of the art in both pose estimation and shape estimation, qualitatively and quantitatively.

Multi-Garment Net: Learning to Dress 3D People From Images

Multi-Garment Network is presented, a method layered on top of the SMPL model that predicts body shape and clothing from a few frames of a video, allowing garment geometry to be predicted, related to the body shape, and transferred to new body shapes and poses.

Combining Implicit Function Learning and Parametric Models for 3D Human Reconstruction

This work presents a methodology that combines detail-rich implicit functions and parametric representations in order to reconstruct 3D models of people that remain controllable and accurate even in the presence of clothing; it is effective even given incomplete point clouds collected from single-view depth images.
...