Data-Driven 3D Reconstruction of Dressed Humans From Sparse Views

Pierre Zins, Yuanlu Xu, Edmond Boyer, Stefanie Wuhrer, Tony Tung
2021 International Conference on 3D Vision (3DV)
Recently, data-driven single-view reconstruction methods have shown great progress in modeling 3D dressed humans. However, such methods suffer heavily from the depth ambiguities and occlusions inherent to single-view inputs. In this paper, we tackle this problem by considering a small set of input views and investigating the best strategy to exploit information from these views. We propose a data-driven end-to-end approach that reconstructs an implicit 3D representation of dressed humans…
FLEX: Parameter-free Multi-view 3D Human Motion Reconstruction
This work introduces FLEX (Free muLti-view rEconstruXion), an end-to-end parameter-free multi-view model. Compared with state-of-the-art methods that are not parameter-free, FLEX outperforms them by a large margin in the absence of camera parameters while obtaining comparable results when camera parameters are available.
FLEX: Extrinsic Parameter-free Multi-view 3D Human Motion Reconstruction
This work introduces FLEX (Free muLti-view rEconstruXion), an end-to-end extrinsic parameter-free (ep-free) multi-view model. Compared with state-of-the-art methods that are not ep-free, FLEX outperforms them by a large margin in the absence of camera parameters while obtaining comparable results when camera parameters are available.
CoNeRF: Controllable Neural Radiance Fields
This work demonstrates for the first time novel view and novel attribute re-rendering of scenes from a single video by treating the attributes as latent variables that are regressed by the neural network given the scene encoding.
BodyMap: Learning Full-Body Dense Correspondence Map
This work presents BodyMap, a new framework for obtaining high-definition, continuous, full-body dense correspondence between in-the-wild images of clothed humans and the surface of a 3D template model. It outperforms prior work by a large margin on various metrics and datasets, including DensePose-COCO.


Learning Non-Volumetric Depth Fusion Using Successive Reprojections
  • S. Donné, Andreas Geiger
  • Computer Science
  • 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
  • 2019
This work proposes to learn an auto-regressive depth refinement directly from data to improve both the output depth maps and the reconstructed point cloud, for both learned and traditional depth estimation front-ends, on both synthetic and real data.
Multi-person Implicit Reconstruction from a Single Image
This work introduces the first end-to-end learning approach to perform model-free implicit reconstruction for realistic 3D capture of multiple clothed people in arbitrary poses (with occlusions) from a single image.
Shape Reconstruction Using Volume Sweeping and Learned Photoconsistency
This work investigates whether learning-based strategies can improve the precision and robustness of reconstructing arbitrary shapes, showing that a CNN trained on a standard static dataset can help recover surface details in dynamic scenes that traditional 2D feature-based methods do not perceive.
Learning to Estimate 3D Human Pose and Shape from a Single Color Image
This work addresses the problem of estimating the full body 3D human pose and shape from a single color image and proposes an efficient and effective direct prediction method based on ConvNets, incorporating a parametric statistical body shape model (SMPL) within an end-to-end framework.
DeepHuman: 3D Human Reconstruction From a Single Image
DeepHuman, an image-guided volume-to-volume translation CNN for 3D human reconstruction from a single RGB image, leverages a dense semantic representation generated from the SMPL model as an additional input to reduce the ambiguities associated with the reconstruction of invisible areas.
Deep Volumetric Video From Very Sparse Multi-view Performance Capture
This work focuses on the task of template-free, per-frame 3D surface reconstruction from as few as three RGB sensors, for which conventional visual hull or multi-view stereo methods fail to generate plausible results.
ARCH: Animatable Reconstruction of Clothed Humans
This paper proposes ARCH (Animatable Reconstruction of Clothed Humans), a novel end-to-end framework for accurate reconstruction of animation-ready 3D clothed humans from a monocular image and shows numerous qualitative examples of animated, high-quality reconstructed avatars unseen in the literature so far.
PIFuHD: Multi-Level Pixel-Aligned Implicit Function for High-Resolution 3D Human Digitization
This work formulates a multi-level architecture that is end-to-end trainable and significantly outperforms existing state-of-the-art techniques on single image human shape reconstruction by fully leveraging 1k-resolution input images.
DenseRaC: Joint 3D Pose and Shape Estimation by Dense Render-and-Compare
This work proposes a novel end-to-end framework for jointly estimating 3D human pose and body shape from a monocular RGB image, and constructs a large-scale synthetic dataset utilizing web-crawled Mocap sequences, 3D scans and animations.
Moulding Humans: Non-Parametric 3D Human Shape Estimation From Single Images
This work proposes a non-parametric approach that employs a double depth map to represent the 3D shape of a person: a visible depth map and a "hidden" depth map are estimated and combined to reconstruct the human 3D shape, as done with a "mould".