Corpus ID: 220363956

Inference Stage Optimization for Cross-scenario 3D Human Pose Estimation

@article{Zhang2020InferenceSO,
  title={Inference Stage Optimization for Cross-scenario 3D Human Pose Estimation},
  author={Jianfeng Zhang and Xuecheng Nie and Jiashi Feng},
  journal={ArXiv},
  year={2020},
  volume={abs/2007.02054}
}
Existing 3D human pose estimation models suffer performance drop when applying to new scenarios with unseen poses due to their limited generalizability. In this work, we propose a novel framework, Inference Stage Optimization (ISO), for improving the generalizability of 3D pose models when source and target data come from different pose distributions. Our main insight is that the target data, even though not labeled, carry valuable priors about their underlying distribution. To exploit such…
Part-aware Measurement for Robust Multi-View Multi-Human 3D Pose Estimation and Tracking
TLDR
This paper introduces an approach for multi-human 3D pose estimation and tracking based on calibrated multiview that takes advantage of temporal consistency to match the 2D poses estimated with previously constructed 3D skeletons in every view.
Bilevel Online Adaptation for Out-of-Domain Human Mesh Reconstruction
TLDR
A new training algorithm named Bilevel Online Adaptation (BOA) is proposed, which divides the optimization process of overall multi-objective into two steps of weight probe and weight update in a training iteration, and leads to state-of-the-art results on two human mesh reconstruction benchmarks.
Adversarial Self-Supervised Learning for Semi-Supervised 3D Action Recognition
TLDR
Adversarial Self-Supervised Learning (ASSL) is presented, a novel framework that tightly couples SSL and the semi-supervised scheme via neighbor relation exploration and adversarial learning to improve the discrimination capability of learned representations for 3D action recognition.
DAG amendment for inverse control of parametric shapes
Parametric shapes model objects as programs producing a geometry based on a few semantic degrees of freedom, called hyper-parameters. These shapes are the typical output of non-destructive modeling,…
Body Meshes as Points
TLDR
A single-stage model that represents multiple person instances as points in the spatial-depth space, where each point is associated with one body mesh, and can directly predict body meshes for multiple persons in a single stage, to simplify the pipeline and lift both efficiency and performance.
PoseAug: A Differentiable Pose Augmentation Framework for 3D Human Pose Estimation
TLDR
PoseAug, a new autoaugmentation framework that learns to augment the available training poses towards a greater diversity and thus improve generalization of the trained 2D-to-3D pose estimator, is presented.

References

SHOWING 1-10 OF 64 REFERENCES
Learning Pose Grammar to Encode Human Body Configuration for 3D Pose Estimation
TLDR
This paper proposes a pose grammar to tackle the problem of 3D human pose estimation, which takes 2D pose as input and learns a generalized 2D-3D mapping function and enforces high-level constraints over human poses.
Weakly-Supervised Discovery of Geometry-Aware Representation for 3D Human Pose Estimation
TLDR
A geometry-aware 3D representation for the human pose is proposed to address this limitation by using multiple views in a simple auto-encoder model at the training stage and only 2D keypoint information as supervision, and injecting the representation as a robust 3D prior.
Sim2real transfer learning for 3D human pose estimation: motion to the rescue
TLDR
This paper shows that standard neural-network approaches, which perform poorly when trained on synthetic RGB images, can perform well when the data is pre-processed to extract cues about the person’s motion, notably as optical flow and the motion of 2D keypoints.
Generalizing Monocular 3D Human Pose Estimation in the Wild
  • Luyang Wang, Yan Chen, +4 authors Jimmy S. J. Ren
  • Computer Science
  • 2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW)
  • 2019
TLDR
This paper proposes a principled approach to generate high-quality 3D pose ground truth given any in-the-wild image with a person inside, and builds a large-scale dataset, which enables the training of a high-quality neural network model, without specialized training scheme and auxiliary loss function, which performs favorably against the state-of-the-art 3D pose estimation methods.
Learning 3D Human Pose from Structure and Motion
TLDR
This work proposes two anatomically inspired loss functions and uses them with a weakly-supervised learning framework to jointly learn from large-scale in-the-wild 2D and indoor/synthetic 3D data and presents a simple temporal network that exploits temporal and structural cues present in predicted pose sequences to temporally harmonize the pose estimations.
Monocular 3D Human Pose Estimation by Generation and Ordinal Ranking
TLDR
A Deep Conditional Variational Autoencoder based model that synthesizes diverse anatomically plausible 3D-pose samples conditioned on the estimated 2D pose is proposed, and it is shown that the CVAE-based 3D-pose sample set is consistent with the 2D pose and helps tackle the inherent ambiguity in 2D-to-3D lifting.
Geometry-Driven Self-Supervised Method for 3D Human Pose Estimation
TLDR
This paper proposes a transform re-projection loss, an effective way to explore multi-view consistency for training the 2D-to-3D lifting network, and uses the confidences of 2D joints to integrate losses from different views, alleviating the influence of noise caused by the self-occlusion problem.
Towards 3D Human Pose Estimation in the Wild: A Weakly-Supervised Approach
TLDR
A weakly-supervised transfer learning method that uses mixed 2D and 3D labels in a unified deep neural network with a two-stage cascaded structure to regularize the 3D pose prediction, which is effective in the absence of ground truth depth labels.
PoseFix: Model-Agnostic General Human Pose Refinement Network
TLDR
This paper proposes a human pose refinement network that estimates a refined pose from a tuple of an input image and input pose and shows that the proposed approach achieves better performance than the conventional multi-stage refinement models and consistently improves the performance of various state-of-the-art pose estimation methods on the commonly used benchmark.
Exploiting Spatial-Temporal Relationships for 3D Pose Estimation via Graph Convolutional Networks
TLDR
A novel graph-based method to tackle the problem of 3D human body and 3D hand pose estimation from a short sequence of 2D joint detections, where domain knowledge about the human hand (body) configurations is explicitly incorporated into the graph convolutional operations to meet the specific demand of the 3D pose estimation.