XNect: Real-time Multi-person 3D Human Pose Estimation with a Single RGB Camera

@article{Mehta2020XNectRM,
  title={XNect: Real-time Multi-person 3D Human Pose Estimation with a Single RGB Camera},
  author={Dushyant Mehta and Oleksandr Sotnychenko and Franziska Mueller and Weipeng Xu and Mohamed A. Elgharib and P. Fua and H. Seidel and Helge Rhodin and Gerard Pons-Moll and C. Theobalt},
  journal={ACM Trans. Graph.},
  year={2020},
  volume={39},
  pages={82}
}
We present a real-time approach for multi-person 3D motion capture at over 30 fps using a single RGB camera. [...] Key Method: The first stage is a convolutional neural network (CNN) that estimates 2D and 3D pose features along with identity assignments for all visible joints of all individuals.
Multi-Person 3D Human Pose Estimation from Monocular Images
TLDR
HG-RCNN, a Mask-RCNN-based network that also leverages the benefits of the Hourglass architecture for multi-person 3D human pose estimation, achieves state-of-the-art results on MuPoTS-3D while also approximating the 3D pose in the camera-coordinate system.
SMAP: Single-Shot Multi-Person Absolute 3D Pose Estimation
TLDR
A novel system that first regresses a set of 2.5D representations of body parts and then reconstructs the 3D absolute poses based on these 2.
Binocular Multi-CNN System for Real-Time 3D Pose Estimation
TLDR
The first open-source algorithm for binocular 3D pose estimation uses two separate lightweight CNNs to estimate disparity/depth information from a stereoscopic camera input to perform full-depth sensing in real time on a consumer-grade laptop.
A Review of 3D Human Pose Estimation from 2D Images
TLDR
An overview of classic and deep learning-based 3D pose estimation approaches is provided, along with relevant evaluation metrics, pose parametrizations, body models, and 3D human pose datasets.
PandaNet: Anchor-Based Single-Shot Multi-Person 3D Pose Estimation
TLDR
PandaNet surpasses previous single-shot methods on several challenging datasets: a virtual but very realistic multi-person urban dataset (JTA Dataset), and two real-world 3D multi-person datasets (CMU Panoptic and MuPoTS-3D).
AnimePose: Multi-person 3D pose estimation and animation
TLDR
A trivial yet effective solution to generate 3D animation of multiple persons from a 2D video using deep learning, and a supervised multi-person 3D pose estimation and animation framework, namely AnimePose, for a given input RGB video sequence.
PI-Net: Pose Interacting Network for Multi-Person Monocular 3D Pose Estimation
TLDR
The pose interacting network, or PI-Net, inputs the initial pose estimates of a variable number of interactees into a recurrent architecture used to refine the pose of the person-of-interest.
SelfPose: 3D Egocentric Pose Estimation from a Headset Mounted Camera
We present a solution to egocentric 3D body pose estimation from monocular images captured from downward-looking fish-eye cameras installed on the rim of a head-mounted VR device. This unusual [...]
VIBE: Video Inference for Human Body Pose and Shape Estimation
TLDR
This work defines a novel temporal network architecture with a self-attention mechanism and shows that adversarial training, at the sequence level, produces kinematically plausible motion sequences without in-the-wild ground-truth 3D labels.
Reconstructing 3D Human Pose by Watching Humans in the Mirror
TLDR
An optimization-based approach is developed that exploits mirror symmetry constraints for accurate 3D pose reconstruction and provides a method to estimate the surface normal of the mirror from vanishing points in the single image.

References

SHOWING 1-10 OF 124 REFERENCES
VNect: Real-time 3D Human Pose Estimation with a Single RGB Camera
TLDR
This work presents the first real-time method to capture the full global 3D skeletal pose of a human in a stable, temporally consistent manner using a single RGB camera and shows that the approach is more broadly applicable than RGB-D solutions, i.e., it works for outdoor scenes, community videos, and low quality commodity RGB cameras.
Single-Shot Multi-person 3D Pose Estimation from Monocular RGB
We propose a new single-shot method for multi-person 3D pose estimation in general scenes from a monocular RGB camera. Our approach uses novel occlusion-robust pose-maps (ORPM) which enable full body [...]
Multi-Person 3D Human Pose Estimation from Monocular Images
TLDR
HG-RCNN, a Mask-RCNN-based network that also leverages the benefits of the Hourglass architecture for multi-person 3D human pose estimation, achieves state-of-the-art results on MuPoTS-3D while also approximating the 3D pose in the camera-coordinate system.
LCR-Net++: Multi-Person 2D and 3D Pose Detection in Natural Images
TLDR
The approach significantly outperforms the state of the art in 3D pose estimation on Human3.6M, a controlled environment, shows promising results on real images for both single- and multi-person subsets of the MPII 2D pose benchmark, and demonstrates satisfying 3D pose results even for multi-person images.
Monocular 3D Human Pose Estimation in the Wild Using Improved CNN Supervision
We propose a CNN-based approach for 3D human body pose estimation from single RGB images that addresses the issue of limited generalizability of models trained solely on the starkly limited publicly [...]
Monocular 3D Pose and Shape Estimation of Multiple People in Natural Scenes: The Importance of Multiple Scene Constraints
TLDR
This paper leverages state-of-the-art deep multi-task neural networks and parametric human and scene modeling towards a fully automatic monocular visual sensing system for multiple interacting people, which infers the 2D and 3D pose and shape of multiple people from a single image.
ArtTrack: Articulated Multi-Person Tracking in the Wild
TLDR
This paper uses a model that resembles existing architectures for single-frame pose estimation but is substantially faster to generate proposals for body joint locations and forms articulated tracking as spatio-temporal grouping of such proposals.
Mo2Cap2: Real-time Mobile 3D Motion Capture with a Cap-mounted Fisheye Camera
TLDR
This work proposes the first real-time system for the egocentric estimation of 3D human body pose in a wide range of unconstrained everyday activities and achieves lower 3D joint error as well as better 2D overlay than the existing baselines.
Towards Accurate Marker-Less Human Shape and Pose Estimation over Time
  • Yinghao Huang
  • Computer Science
  • 2017 International Conference on 3D Vision (3DV)
  • 2017
TLDR
This work presents a fully automatic method that, given multi-view videos, estimates 3D human pose and body shape; it takes the recently proposed SMPLify method as its base and extends it in several ways.
Monocular Total Capture: Posing Face, Body, and Hands in the Wild
TLDR
This work presents the first method to capture the 3D total motion of a target person from a monocular view input, and leverages a 3D deformable human model to reconstruct total body pose from the CNN outputs with the aid of the pose and shape prior in the model.