• Corpus ID: 199502034

Semantic Estimation of 3D Body Shape and Pose using Minimal Cameras

@article{Gilbert2020SemanticEO,
  title={Semantic Estimation of 3D Body Shape and Pose using Minimal Cameras},
  author={Andrew Gilbert and Matthew Trumble and Adrian Hilton and John P. Collomosse},
  journal={ArXiv},
  year={2020},
  volume={abs/1908.03030}
}
We present an approach to accurately estimate high fidelity markerless 3D pose and volumetric reconstruction of human performance using only a small set of camera views ($\sim 2$). Our method utilises a dual loss in a generative adversarial network that can yield improved performance in both reconstruction and pose estimate error. We use a deep prior implicitly learnt by the network trained over a dataset of view-ablated multi-view video footage of a wide range of subjects and actions. Uniquely… 

References

Deep Autoencoder for Combined Human Pose Estimation and Body Model Upscaling

TLDR
A symmetric convolutional autoencoder is trained with a dual loss that enforces learning of a latent representation encoding skeletal joint positions while, at the same time, learning a deep representation of volumetric body shape.

Towards Accurate Marker-Less Human Shape and Pose Estimation over Time

  • Yinghao Huang
  • Computer Science
    2017 International Conference on 3D Vision (3DV)
  • 2017
TLDR
This work presents a fully automatic method that, given multi-view videos, estimates 3D human pose and body shape. It takes the recently proposed SMPLify method as its base and extends it in several ways.

Sparseness Meets Deepness: 3D Human Pose Estimation from Monocular Video

TLDR
This paper addresses the challenge of 3D full-body human pose estimation from a monocular image sequence with a novel approach that integrates a sparsity-driven 3D geometric prior and temporal smoothness and outperforms a publicly available 2D pose estimation baseline on the challenging PennAction dataset.

Exploiting Temporal Information for 3D Human Pose Estimation

TLDR
A sequence-to-sequence network is designed, composed of layer-normalized LSTM units with shortcut connections from the input to the output on the decoder side and a temporal smoothness constraint imposed during training. This helps the network recover temporally consistent 3D poses over a sequence of images even when the 2D pose detector fails.

Fusing Visual and Inertial Sensors with Semantics for 3D Human Pose Estimation

TLDR
A multi-channel 3D convolutional neural network is used to learn a pose embedding from visual occupancy and semantic 2D pose estimates from the multi-viewpoint video (MVV) in a discretised volumetric probabilistic visual hull, yielding improved accuracy over prior methods.

Total Capture: 3D Human Pose Estimation Fusing Video and Inertial Sensors

TLDR
An algorithm for fusing multi-viewpoint video (MVV) with inertial measurement unit (IMU) sensor data to accurately estimate 3D human pose is presented, yielding improved accuracy over prior methods.

Indirect deep structured learning for 3D human body shape and pose prediction

TLDR
A novel encoder-decoder architecture for 3D body shape and pose prediction that uses SMPL (a statistical body shape model) parameters as input, together with a corresponding training procedure and a quantitative and qualitative analysis of the proposed method on artificial and real image datasets.

Volumetric performance capture from minimal camera viewpoints

We present a convolutional autoencoder that enables high fidelity volumetric reconstructions of human performance to be captured from multi-view video comprising only a small set of camera views. Our

Recovering Accurate 3D Human Pose in the Wild Using IMUs and a Moving Camera

TLDR
This work proposes a method that combines a single hand-held camera and a set of Inertial Measurement Units (IMUs) attached to the body limbs to estimate accurate 3D poses in the wild. It obtains an accuracy of 26 mm, which makes it accurate enough to serve as a benchmark for image-based 3D pose estimation in the wild.
...