LASOR: Learning Accurate 3D Human Pose and Shape via Synthetic Occlusion-Aware Data and Neural Mesh Rendering
@article{Yang2021LASORLA,
  title={LASOR: Learning Accurate 3D Human Pose and Shape via Synthetic Occlusion-Aware Data and Neural Mesh Rendering},
  author={Kaibing Yang and Renshu Gu and Maoyu Wang and Masahiro Toyoura and Gang Xu},
  journal={IEEE Transactions on Image Processing},
  year={2021},
  volume={31},
  pages={1938--1948}
}
A key challenge in the task of human pose and shape estimation is occlusion, including self-occlusions, object-human occlusions, and inter-person occlusions. The lack of diverse and accurate pose and shape training data becomes a major bottleneck, especially for scenes with occlusions in the wild. In this paper, we focus on the estimation of human pose and shape in the case of inter-person occlusions, while also handling object-human occlusions and self-occlusion. We propose a novel framework…
3 Citations
Occluded Human Body Capture with Self-Supervised Spatial-Temporal Motion Prior
- Computer Science
- ArXiv
- 2022
The key idea is to employ non-occluded human data to learn a joint-level spatial-temporal motion prior for occluded humans with a self-supervised strategy. Experimental results show that the method generates accurate and coherent human motions with good generalization ability and runtime efficiency.
A Progressive Quadric Graph Convolutional Network for 3D Human Mesh Recovery
- Computer Science
- IEEE Transactions on Circuits and Systems for Video Technology
- 2023
A Progressive Quadric Graph Convolutional Network (PQ-GCN) is proposed, and a simple and fast method for 3D human mesh recovery from a single image in the wild is designed, using 66% fewer parameters than the existing method, Pose2Mesh.
PLIKS: A Pseudo-Linear Inverse Kinematic Solver for 3D Human Body Estimation
- Computer Science
- ArXiv
- 2022
We consider the problem of reconstructing a 3D mesh of the human body from a single 2D image as a model-in-the-loop optimization problem. Existing approaches often regress the shape, pose, and…
References
Showing 1-10 of 45 references
Object-Occluded Human Shape and Pose Estimation From a Single Color Image
- Computer Science
- 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
- 2020
This paper proposes a novel two-branch network architecture to train an end-to-end regressor via the latent feature supervision, which also includes a novel saliency map sub-net to extract the human information from object-occluded color images.
Synthetic Training for Accurate 3D Human Pose and Shape Estimation in the Wild
- Computer Science
- BMVC
- 2020
STRAPS (Synthetic Training for Real Accurate Pose and Shape) is a system that utilises proxy representations, such as silhouettes and 2D joints, as inputs to a shape and pose regression neural network, which is trained with synthetic training data (generated on-the-fly during training using the SMPL statistical body model) to overcome data scarcity.
3DCrowdNet: 2D Human Pose-Guided 3D Crowd Human Pose and Shape Estimation in the Wild
- Computer Science
- ArXiv
- 2021
3DCrowdNet, a 2D human pose-guided 3D crowd pose and shape estimation system for in-the-wild scenes, is designed to leverage the robust 2D pose outputs from off-the-shelf 2D pose estimators, which guide a network to focus on a target person and provide essential human articulation information.
DenseRaC: Joint 3D Pose and Shape Estimation by Dense Render-and-Compare
- Computer Science
- 2019 IEEE/CVF International Conference on Computer Vision (ICCV)
- 2019
A novel end-to-end framework for jointly estimating 3D human pose and body shape from a monocular RGB image is proposed, and a large-scale synthetic dataset utilizing web-crawled Mocap sequences, 3D scans and animations is constructed.
Monocular 3D Human Pose Estimation in the Wild Using Improved CNN Supervision
- Computer Science
- 2017 International Conference on 3D Vision (3DV)
- 2017
We propose a CNN-based approach for 3D human body pose estimation from single RGB images that addresses the issue of limited generalizability of models trained solely on the starkly limited publicly…
Monocular 3D Pose and Shape Estimation of Multiple People in Natural Scenes: The Importance of Multiple Scene Constraints
- Computer Science
- 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition
- 2018
This paper leverages state-of-the-art deep multi-task neural networks and parametric human and scene modeling towards a fully automatic monocular visual sensing system for multiple interacting people, which infers the 2D and 3D pose and shape of multiple people from a single image.
Exploring Severe Occlusion: Multi-Person 3D Pose Estimation with Gated Convolution
- Computer Science
- 2020 25th International Conference on Pattern Recognition (ICPR)
- 2021
This paper proposes a temporal regression network with a gated convolution module that transforms 2D joints to 3D while recovering the missing occluded joints, and shows that the proposed method outperforms most state-of-the-art 2D-to-3D pose estimation methods, especially in scenarios with heavy occlusions.
Learning to Estimate 3D Human Pose and Shape from a Single Color Image
- Computer Science
- 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition
- 2018
This work addresses the problem of estimating the full body 3D human pose and shape from a single color image and proposes an efficient and effective direct prediction method based on ConvNets, incorporating a parametric statistical body shape model (SMPL) within an end-to-end framework.
End-to-End Recovery of Human Shape and Pose
- Computer Science
- 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition
- 2018
This work introduces an adversary trained to tell whether human body shape and pose parameters are real or not using a large database of 3D human meshes, and produces a richer and more useful mesh representation that is parameterized by shape and 3D joint angles.
Estimating Human Pose from Occluded Images
- Computer Science
- ACCV
- 2009
Experimental results on synthetic and real data sets bear out the theory that with sparse representation 3D human pose can be robustly estimated when humans are partially or heavily occluded in the scenes.