Full-Body Awareness from Partial Observations

C. Rockwell and David F. Fouhey
There has been great progress in human 3D mesh recovery and great interest in learning about the world from consumer video data. Unfortunately, current methods for 3D human mesh recovery work rather poorly on consumer video data, since on the Internet unusual camera viewpoints and aggressive truncations are the norm rather than a rarity. We study this problem and make a number of contributions to address it: (i) we propose a simple but highly effective self-training framework that adapts human…
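The abstract above describes a self-training framework for adaptation. As a rough illustration only (hypothetical function names, not the authors' actual implementation), such a loop typically has a pretrained model label unlabeled frames, keeps its confident predictions as pseudo-ground-truth, and fine-tunes on them:

```python
# Minimal self-training sketch (illustrative; names are assumptions,
# not the paper's actual code).
def self_train(model, unlabeled_frames, fit, confidence_threshold=0.8, rounds=2):
    """Iteratively fine-tune `model` on its own confident predictions.

    `model(frame)` is assumed to return (prediction, confidence);
    `fit(model, pairs)` is assumed to return a fine-tuned model.
    """
    for _ in range(rounds):
        pseudo_labeled = []
        for frame in unlabeled_frames:
            pred, conf = model(frame)
            if conf >= confidence_threshold:       # keep only confident outputs
                pseudo_labeled.append((frame, pred))
        model = fit(model, pseudo_labeled)         # fine-tune on pseudo-labels
    return model
```

The confidence filter is what makes the loop workable: fine-tuning on all predictions would reinforce the model's own errors on hard, truncated views.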
Human Mesh Recovery from Multiple Shots
Addresses the insight that, while shot changes of the same scene incur a discontinuity between frames, the 3D structure of the scene still changes smoothly. This allows frames before and after a shot change to be treated as a multi-view signal that provides strong cues to recover the 3D state of the actors.
4D Human Body Capture from Egocentric Video via 3D Scene Grounding
This work proposes a novel optimization-based approach that leverages 2D observations of the entire video sequence and human-scene interaction constraints to estimate second-person human poses, shapes, and global motion, grounded on the 3D environment captured from the egocentric view.
SPEC: Seeing People in the Wild with an Estimated Camera
In this supplementary document, we provide more information that is not covered in the main text, ranging from technical details of the method to visual examples of the proposed datasets, and more.
HuMoR: 3D Human Motion Model for Robust Pose Estimation
An expressive generative model in the form of a conditional variational autoencoder, which learns a distribution of the change in pose at each step of a motion sequence, which generalizes to diverse motions and body shapes after training on a large motion capture dataset and enables motion reconstruction from multiple input modalities.
Selective Spatio-Temporal Aggregation Based Pose Refinement System: Towards Understanding Human Activities in Real-World Videos
  • Di Yang, Rui Dai, +4 authors, F. Brémond
  • 2021 IEEE Winter Conference on Applications of Computer Vision (WACV), 2021
Proposes a Selective Spatio-Temporal Aggregation mechanism that refines and smooths the keypoint locations extracted by multiple expert pose estimators, and an effective weakly-supervised self-training framework that leverages the aggregated poses as pseudo ground-truth instead of handcrafted annotations for real-world pose estimation.
PARE: Part Attention Regressor for 3D Human Body Estimation
The Supplementary Material consists of this document and a video, which include acknowledgements, disclosures, and additional information and visualizations of our method and results.
Body Meshes as Points
A single-stage model that represents multiple person instances as points in the spatial-depth space, where each point is associated with one body mesh, and can directly predict body meshes for multiple persons in a single stage, simplifying the pipeline and lifting both efficiency and performance.
LoBSTr: Real‐time Lower‐body Pose Prediction from Sparse Upper‐body Tracking Signals
This paper introduces a deep neural network (DNN) based method for real-time prediction of the lower-body pose from only the tracking signals of the upper-body joints, and demonstrates the effectiveness of the method through several quantitative evaluations against other architectures and input representations on wild tracking data obtained from commercial VR devices.


End-to-End Recovery of Human Shape and Pose
This work introduces an adversary trained to tell whether human body shape and pose parameters are real or not using a large database of 3D human meshes, and produces a richer and more useful mesh representation that is parameterized by shape and 3D joint angles.
Learning to Estimate 3D Human Pose and Shape from a Single Color Image
This work addresses the problem of estimating the full body 3D human pose and shape from a single color image and proposes an efficient and effective direct prediction method based on ConvNets, incorporating a parametric statistical body shape model (SMPL) within an end-to-end framework.
Convolutional Mesh Regression for Single-Image Human Shape Reconstruction
This paper addresses the problem of 3D human pose and shape estimation from a single image by proposing a graph-based mesh regression, which outperforms comparable baselines relying on model parameter regression and achieves state-of-the-art results among model-based pose estimation approaches.
2D Human Pose Estimation: New Benchmark and State of the Art Analysis
A novel benchmark "MPII Human Pose" is introduced that makes a significant advance in terms of diversity and difficulty, a contribution that is required for future developments in human body models.
Monocular 3D Human Pose Estimation in the Wild Using Improved CNN Supervision
We propose a CNN-based approach for 3D human body pose estimation from single RGB images that addresses the issue of limited generalizability of models trained solely on the starkly limited publicly…
Unite the People: Closing the Loop Between 3D and 2D Human Representations
This work proposes a hybrid approach to 3D body model fits for multiple human pose datasets with an extended version of the recently introduced SMPLify method, and shows that UP-3D can be enhanced with these improved fits to grow in quantity and quality, which makes the system deployable on large scale.
Monocular Total Capture: Posing Face, Body, and Hands in the Wild
This work presents the first method to capture the 3D total motion of a target person from a monocular view input, and leverages a 3D deformable human model to reconstruct total body pose from the CNN outputs with the aid of the pose and shape prior in the model.
Mo2Cap2: Real-time Mobile 3D Motion Capture with a Cap-mounted Fisheye Camera
This work proposes the first real-time system for the egocentric estimation of 3D human body pose in a wide range of unconstrained everyday activities and achieves lower 3D joint error as well as better 2D overlay than the existing baselines.
Self-supervised Learning of Motion Capture
This work proposes a learning based motion capture model that optimizes neural network weights that predict 3D shape and skeleton configurations given a monocular RGB video and shows that the proposed model improves with experience and converges to low-error solutions where previous optimization methods fail.
OpenPose: Realtime Multi-Person 2D Pose Estimation Using Part Affinity Fields
OpenPose is released, the first open-source realtime system for multi-person 2D pose detection, including body, foot, hand, and facial keypoints, and the first combined body and foot keypoint detector, based on an internal annotated foot dataset.