• Corpus ID: 160017424

Learning Pose Grammar for Monocular 3D Pose Estimation

Authors: Yuanlu Xu, Wenguan Wang, Xiaobai Liu, Jianwen Xie, Jianbing Shen, Song-Chun Zhu
In this paper, we propose a pose grammar to tackle the problem of 3D human pose estimation from a monocular RGB image. Our model takes an estimated 2D pose as input and learns a generalized 2D-3D mapping function to lift it into a 3D pose. The proposed model consists of a base network, which efficiently captures pose-aligned features, and a hierarchy of Bi-directional RNNs (BRNNs) on top, which explicitly incorporates knowledge about human body configuration (i.e., kinematics…


A Review of 3D Human Pose Estimation from 2D Images
An overview of classic and deep learning-based 3D pose estimation approaches is provided, pointing out relevant evaluation metrics, pose parametrizations, body models, and 3D human pose datasets.
Lifting 2D Human Pose to 3D with Domain Adapted 3D Body Concept
The proposed framework unifies supervised and semi-supervised 3D pose estimation in a principled way, and the explicitly learned 3D body concept is shown to effectively alleviate the 2D-3D ambiguity in 2D pose lifting, improve generalization, and enable the network to exploit abundant unlabeled 2D data.


Learning Pose Grammar to Encode Human Body Configuration for 3D Pose Estimation
This paper proposes a pose grammar to tackle the problem of 3D human pose estimation, which takes 2D pose as input and learns a generalized 2D-3D mapping function and enforces high-level constraints over human poses.
3D Human Pose Estimation from a Single Image via Distance Matrix Regression
F. Moreno-Noguer. Computer Science. 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
It is shown that more precise pose estimates can be obtained by representing both the 2D and 3D human poses using NxN distance matrices, and formulating the problem as a 2D-to-3D distance matrix regression.
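To make the distance-matrix representation concrete, here is a minimal sketch of how an NxN matrix of pairwise Euclidean distances could be built from a list of joint coordinates. The function name `distance_matrix` and the three-joint pose are made up for illustration; this shows only the representation, not the regression method of the paper.

```python
import math

def distance_matrix(joints):
    """Return the N x N matrix of Euclidean distances between joints.

    `joints` is a list of N coordinate tuples (2D or 3D).
    """
    n = len(joints)
    return [[math.dist(joints[i], joints[j]) for j in range(n)]
            for i in range(n)]

# Hypothetical three-joint 2D pose (coordinates invented for illustration).
pose_2d = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
edm_2d = distance_matrix(pose_2d)
```

The resulting matrix is symmetric with a zero diagonal, and it does not change when the pose is rotated or translated, which is one reason such a representation can be attractive as a regression target.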
Pose-conditioned joint angle limits for 3D human pose reconstruction
A general parametrization of body pose is defined, along with a new multi-stage method that estimates 3D pose from 2D joint locations using an over-complete dictionary of poses; the method shows good generalization while avoiding impossible poses.
MoCap-guided Data Augmentation for 3D Pose Estimation in the Wild
This paper introduces an image-based synthesis engine that artificially augments a dataset of real images with 2D human pose annotations using 3D Motion Capture (MoCap) data to generate a large set of photorealistic synthetic images of humans with 3D pose annotations.
Monocular 3D Human Pose Estimation by Predicting Depth on Joints
The empirical evaluation on the Human3.6M and HHOI datasets demonstrates the advantage of combining the global 2D skeleton and local image patches for depth prediction, as well as superior quantitative and qualitative performance relative to state-of-the-art methods.
Robust Estimation of 3D Human Poses from a Single Image
This work proposes a method for estimating 3D human poses from a single image that works in conjunction with an existing 2D pose/joint detector and outperforms the state of the art on three benchmark datasets.
A Simple Yet Effective Baseline for 3d Human Pose Estimation
The results indicate that a large portion of the error of modern deep 3d pose estimation systems stems from their visual analysis, and suggest directions to further advance the state of the art in 3d human pose estimation.
Exploiting Temporal Information for 3D Human Pose Estimation
A sequence-to-sequence network is designed, composed of layer-normalized LSTM units with shortcut connections from the input to the output on the decoder side, and a temporal smoothness constraint is imposed during training. This helps the network recover temporally consistent 3D poses over a sequence of images even when the 2D pose detector fails.
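As a rough illustration of what a temporal smoothness constraint can look like, the sketch below computes a generic first-order penalty: the mean squared difference between consecutive poses in a sequence. The summary above does not specify the exact form used in the paper, so the function `temporal_smoothness` and the toy sequences are assumptions for illustration only.

```python
def temporal_smoothness(sequence):
    """Mean squared difference between consecutive frames.

    `sequence` is a list of T poses, each a flat list of coordinates.
    A sequence with fewer than two frames has no penalty.
    """
    if len(sequence) < 2:
        return 0.0
    total = 0.0
    count = 0
    for prev, curr in zip(sequence, sequence[1:]):
        for a, b in zip(prev, curr):
            total += (b - a) ** 2
            count += 1
    return total / count

# Toy sequences (invented values): a static pose and a jittering one.
static_seq = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
jitter_seq = [[0.0, 0.0], [1.0, 1.0], [0.0, 0.0]]
static_penalty = temporal_smoothness(static_seq)
jitter_penalty = temporal_smoothness(jitter_seq)
```

Added to a training loss with some weight, a term of this kind penalizes frame-to-frame jitter in the predicted poses, so the jittering sequence scores higher than the static one.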
Sparseness Meets Deepness: 3D Human Pose Estimation from Monocular Video
This paper addresses the challenge of 3D full-body human pose estimation from a monocular image sequence with a novel approach that integrates a sparsity-driven 3D geometric prior and temporal smoothness and outperforms a publicly available 2D pose estimation baseline on the challenging PennAction dataset.
3D Human Pose Estimation in the Wild by Adversarial Learning
An adversarial learning framework is proposed, which distills the 3D human pose structures learned from the fully annotated dataset to in-the-wild images with only 2D pose annotations and designs a geometric descriptor, which computes the pairwise relative locations and distances between body joints, as a new information source for the discriminator.
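A descriptor built from pairwise relative locations and distances can be sketched as follows. This is a minimal sketch under assumptions: the function name `geometric_descriptor`, the output layout, and the two-joint pose are hypothetical, and the descriptor in the paper may be organized differently.

```python
import math

def geometric_descriptor(joints):
    """Return (offsets, distances) for every ordered pair of joints.

    offsets[i][j] is the vector from joint i to joint j;
    distances[i][j] is the Euclidean length of that vector.
    """
    n = len(joints)
    offsets = [[tuple(b - a for a, b in zip(joints[i], joints[j]))
                for j in range(n)] for i in range(n)]
    distances = [[math.hypot(*offsets[i][j]) for j in range(n)]
                 for i in range(n)]
    return offsets, distances

# Hypothetical two-joint 3D pose (coordinates invented for illustration).
pose_3d = [(0.0, 0.0, 0.0), (1.0, 2.0, 2.0)]
offsets, distances = geometric_descriptor(pose_3d)
```

Unlike distances alone, the offsets keep the direction between joints, so the two parts of the descriptor carry complementary information.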