Probabilistic Modeling for Human Mesh Recovery

Authors: Nikos Kolotouros, Georgios Pavlakos, Dinesh Jayaraman, Kostas Daniilidis
Venue: 2021 IEEE/CVF International Conference on Computer Vision (ICCV)

This paper focuses on the problem of 3D human reconstruction from 2D evidence. Although this is an inherently ambiguous problem, the majority of recent works avoid the uncertainty modeling and typically regress a single estimate for a given input. In contrast to that, in this work, we propose to embrace the reconstruction ambiguity and we recast the problem as learning a mapping from the input to a distribution of plausible 3D poses. Our approach is based on the normalizing flows model and… 

Multi-hypothesis 3D human pose estimation metrics favor miscalibrated distributions

This study identifies that miscalibration can be attributed to the use of sample-based metrics such as minMPJPE and proposes an accurate and well-calibrated model called Conditional Graph Normalizing Flow (cGNF), structured such that a single cGNF can estimate both conditional and marginal densities within the same model, effectively solving a zero-shot density estimation problem.
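
As a rough illustration of the sample-based metric named above, the sketch below computes minMPJPE for a toy set of hypotheses. It is not taken from the cited paper: the function names `mpjpe` and `min_mpjpe` and the toy joint coordinates are assumptions made for the example. Because only the best sample counts, drawing more widely spread samples can lower the score without the distribution being better calibrated.

```python
import math

def mpjpe(pred, target):
    """Mean per-joint position error: average Euclidean distance over joints."""
    return sum(math.dist(p, t) for p, t in zip(pred, target)) / len(target)

def min_mpjpe(samples, target):
    """Lowest MPJPE among a set of sampled pose hypotheses."""
    return min(mpjpe(s, target) for s in samples)

# Toy data (hypothetical): two joints, two hypotheses, arbitrary units.
target = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
samples = [
    [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)],  # both joints off by 1.0
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.5)],  # one joint off by 0.5
]

print(min_mpjpe(samples, target))  # only the best hypothesis is scored
```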

Recovering 3D Human Mesh from Monocular Images: A Survey

This is the first survey to focus on the task of monocular 3D human mesh recovery. It starts with an introduction of body models and then elaborates on recovery frameworks and training objectives, providing in-depth analyses of their strengths and weaknesses.

Learning to Fit Morphable Models

This work builds upon recent advances in learned optimization and proposes an update rule inspired by the classic Levenberg–Marquardt algorithm that can be applied to new model fitting problems and offers a competitive alternative to well-tuned model fitting pipelines, both in terms of accuracy and speed.
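
For context, the classic Levenberg–Marquardt step that the summary above refers to can be sketched on a one-parameter least-squares problem. This is the textbook damped Gauss-Newton update with a fixed damping value, not the learned update rule proposed in the cited work. The function name `lm_fit_slope`, the model `y = a * x`, and the data are assumptions chosen for illustration.

```python
def lm_fit_slope(xs, ys, a=0.0, damping=1.0, steps=50):
    """Fit y = a * x with classic Levenberg-Marquardt steps (fixed damping)."""
    for _ in range(steps):
        residuals = [a * x - y for x, y in zip(xs, ys)]  # r_i = a * x_i - y_i
        jtj = sum(x * x for x in xs)                      # J^T J, with J_i = x_i
        jtr = sum(x * r for x, r in zip(xs, residuals))   # J^T r
        a -= jtr / (jtj + damping)                        # damped Gauss-Newton step
    return a

# Hypothetical noiseless data generated with slope 2.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]
a_hat = lm_fit_slope(xs, ys)
print(a_hat)
```

A larger `damping` makes each step shorter and closer to plain gradient descent, while a value near zero recovers the Gauss-Newton step.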

Learned Vertex Descent: A New Direction for 3D Human Model Fitting

An exhaustive evaluation demonstrates that the proposed novel optimization-based paradigm, dubbed LVD, is able to capture the underlying body of clothed people with very different body shapes, achieving a significant improvement compared to state-of-the-art.

Supplementary Material for: Human Mesh Recovery from Multiple Shots

The proposed multi-shot optimization incorporates temporal smoothness regularization on both the pose parameters and the 3D joints, which justifies the presence of both terms in the optimization, and demonstrates that the choice of a transformer-based temporal encoder while being a more appropriate choice…

Benchmarking and Analyzing 3D Human Pose and Shape Estimation Beyond Algorithms

This work presents the first large-scale benchmarking of various configurations for mesh recovery tasks from three under-explored perspectives beyond algorithms, and identifies the key strategies and remarks that can significantly enhance the model performance.

Learning Visibility for Robust Dense Human Body Estimation

Estimating 3D human pose and shape from 2D images is a crucial yet challenging task. While prior methods with model-based representations can perform reasonably well on whole-body images, they…

The One Where They Reconstructed 3D Humans and Environments in TV Shows

This paper proposes an automatic approach that operates on an entire season of a TV show and aggregates information in 3D; it builds a 3D model of the environment, computes camera information, static 3D scene structure and body scale information, and demonstrates how this information acts as rich 3D context.

GLAMR: Global Occlusion-Aware Human Mesh Recovery with Dynamic Cameras

The proposed approach outperforms prior methods significantly in terms of motion infilling and global mesh recovery and presents a global optimization framework that refines the predicted trajectories and optimizes the camera poses to match the video evidence such as 2D keypoints.

Human Mesh Recovery from Multiple Shots

The insight that, while shot changes of the same scene incur a discontinuity between frames, the 3D structure of the scene still changes smoothly allows us to handle frames before and after the shot change as multi-view signals that provide strong cues to recover the 3D state of the actors.



Hierarchical Kinematic Human Mesh Recovery

A new technique for regression of human parametric model that is explicitly informed by the known hierarchical structure, including joint interdependencies of the model, results in a strong prior-informed design of the regressor architecture and an associated hierarchical optimization that is flexible to be used in conjunction with the current standard frameworks for 3D human mesh recovery.

Convolutional Mesh Regression for Single-Image Human Shape Reconstruction

This paper addresses the problem of 3D human pose and shape estimation from a single image by proposing a graph-based mesh regression, which outperforms the comparable baselines relying on model parameter regression and achieves state-of-the-art results among model-based pose estimation approaches.

Coherent Reconstruction of Multiple Humans From a Single Image

This work addresses the problem of multi-person 3D pose estimation from a single image by incorporating the SMPL parametric body model in a top-down framework and proposing two novel losses that enable more coherent reconstruction in natural images.

Learning to Reconstruct 3D Human Pose and Shape via Model-Fitting in the Loop

The core of the proposed approach SPIN (SMPL oPtimization IN the loop) is that the two paradigms can form a strong collaboration, and better network estimates can lead the optimization to better solutions, while more accurate optimization fits provide better supervision for the network.

TexturePose: Supervising Human Mesh Estimation With Texture Consistency

This work proposes a natural form of supervision that capitalizes on the appearance constancy of a person among different frames (or viewpoints) and achieves state-of-the-art results among model-based pose estimation approaches in different benchmarks.

Human Body Model Fitting by Learned Gradient Descent

A gradient descent algorithm that leverages a neural network to predict the parameter update rule for each iteration, which guides the optimizer towards a good solution and typically converges in very few steps.

Coarse-to-Fine Volumetric Prediction for Single-Image 3D Human Pose

This paper proposes a fine discretization of the 3D space around the subject and trains a ConvNet to predict per voxel likelihoods for each joint, which creates a natural representation for 3D pose and greatly improves performance over the direct regression of joint coordinates.

Unite the People: Closing the Loop Between 3D and 2D Human Representations

This work proposes a hybrid approach to 3D body model fits for multiple human pose datasets with an extended version of the recently introduced SMPLify method, and shows that UP-3D can be enhanced with these improved fits to grow in quantity and quality, which makes the system deployable on large scale.

Exploiting Temporal Context for 3D Human Pose Estimation in the Wild

This work presents a bundle-adjustment-based algorithm for recovering accurate 3D human pose and meshes from monocular videos and shows that retraining a single-frame 3D pose estimator on this data improves accuracy on both real-world and mocap data by evaluating on the 3DPW and HumanEVA datasets.

Keep It SMPL: Automatic Estimation of 3D Human Pose and Shape from a Single Image

The first method to automatically estimate the 3D pose of the human body as well as its 3D shape from a single unconstrained image is described, showing superior pose accuracy with respect to the state of the art.