Model-based 3D Hand Reconstruction via Self-Supervised Learning
@article{Chen2021Modelbased3H,
  title={Model-based 3D Hand Reconstruction via Self-Supervised Learning},
  author={Yujin Chen and Zhigang Tu and Di Kang and Linchao Bao and Ying Zhang and Xuefei Zhe and Ruizhi Chen and Junsong Yuan},
  journal={2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2021},
  pages={10446--10455}
}
Reconstructing a 3D hand from a single-view RGB image is challenging due to various hand configurations and depth ambiguity. To reliably reconstruct a 3D hand from a monocular image, most state-of-the-art methods heavily rely on 3D annotations at the training stage, but obtaining 3D annotations is expensive. To alleviate reliance on labeled training data, we propose S2HAND, a self-supervised 3D hand reconstruction network that can jointly estimate pose, shape, texture, and the camera viewpoint…
13 Citations
Consistent 3D Hand Reconstruction in Video via self-supervised Learning
- Computer Science · ArXiv
- 2022
S2HAND is proposed, a self-supervised 3D hand reconstruction model that can jointly estimate pose, shape, texture, and the camera viewpoint from a single RGB input through the supervision of easily accessible 2D detected keypoints.
End-to-end Weakly-supervised Multiple 3D Hand Mesh Reconstruction from Single Image
- Computer Science · ArXiv
- 2022
This paper designs a multi-head auto-encoder structure for multi-hand reconstruction, where each head network shares the same feature map and outputs the hand center, pose and texture, respectively, and adopts a weakly-supervised scheme to alleviate the burden of expensive 3D real-world data annotations.
Efficient Annotation and Learning for 3D Hand Pose Estimation: A Survey
- Computer Science · ArXiv
- 2022
This survey presents a comprehensive analysis of 3D hand pose estimation from the perspective of efficient annotation and learning, and investigates annotation methods classified as manual, synthetic-model-based, hand-sensor-based, and computational approaches.
Joint Hand-Object 3D Reconstruction From a Single Image With Cross-Branch Feature Fusion
- Computer Science · IEEE Transactions on Image Processing
- 2021
This work proposes to consider hand and object jointly in feature space and explore the reciprocity of the two branches in cross-branch feature fusion architectures with MLP or LSTM units, and significantly outperforms existing approaches in terms of the reconstruction accuracy of objects.
HandTailor: Towards High-Precision Monocular 3D Hand Recovery
- Computer Science · ArXiv
- 2021
This work introduces a novel framework HandTailor, which combines a learning-based hand module and an optimization-based tailor module to achieve high-precision hand mesh recovery from a monocular RGB image.
Multi-view Image-based Hand Geometry Refinement using Differentiable Monte Carlo Ray Tracing
- Computer Science · ArXiv
- 2021
An image-based refinement is achieved through differentiable ray tracing, a method that has not been employed so far to relevant problems and is hereby shown to be superior to the approximative alternatives that have been employed in the past.
Semi-Supervised 3D Hand Shape and Pose Estimation with Label Propagation
- Computer Science · 2021 Digital Image Computing: Techniques and Applications (DICTA)
- 2021
The Pose Alignment network is proposed to propagate 3D annotations from labelled frames to nearby unlabelled frames in sparsely annotated videos to improve the pose estimation accuracy and incorporate the alignment supervision on pairs of labelled-unlabelled frames.
Local and Global Point Cloud Reconstruction for 3D Hand Pose Estimation
- Computer Science, Environmental Science · ArXiv
- 2021
This paper presents a novel pipeline for local and global point cloud reconstruction using a 3D hand template while learning a latent representation for pose estimation, and introduces a new multi-view hand posture dataset to obtain complete 3D point clouds of the hand in the real world.
InterNet+: A Light Network for Hand Pose Estimation
- Computer Science · Sensors
- 2021
A redesigned feature extractor incorporates recent advances in computer vision, such as the ACON activation function and a new attention mechanism module, to better extract global features from an RGB image of the hand, leading to a greater performance improvement compared to InterNet and other similar networks.
3D interacting hand pose and shape estimation from a single RGB image
- Computer Science · Neurocomputing
- 2022
References
SHOWING 1-10 OF 55 REFERENCES
3D Hand Shape and Pose Estimation From a Single RGB Image
- Computer Science · 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
- 2019
This work proposes a Graph Convolutional Neural Network (Graph CNN) based method to reconstruct a full 3D mesh of hand surface that contains richer information of both 3D hand shape and pose and proposes a weakly-supervised approach by leveraging the depth map as a weak supervision in training.
3D Hand Shape and Pose From Images in the Wild
- Computer Science · 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
- 2019
This work presents the first end-to-end deep learning based method that predicts both 3D hand shape and pose from RGB images in the wild, consisting of the concatenation of a deep convolutional encoder and a fixed model-based decoder.
Weakly-Supervised 3D Hand Pose Estimation from Monocular RGB Images
- Computer Science · ECCV
- 2018
A weakly-supervised method that adapts from a fully-annotated synthetic dataset to a weakly-labeled real-world dataset with the aid of a depth regularizer, which generates depth maps from the predicted 3D pose and serves as weak supervision for 3D pose regression.
Shape and Viewpoint without Keypoints
- Computer Science · ECCV
- 2020
We present a learning framework that learns to recover the 3D shape, pose and texture from a single image, trained on an image collection without any ground truth 3D shape, multi-view, camera…
HTML: A Parametric Hand Texture Model for 3D Hand Reconstruction and Personalization
- Computer Science · ECCV
- 2020
3D hand reconstruction from images is a widely-studied problem in computer vision and graphics, and has a particularly high relevance for virtual and augmented reality. Although several 3D hand…
End-to-End Hand Mesh Recovery From a Monocular RGB Image
- Computer Science · 2019 IEEE/CVF International Conference on Computer Vision (ICCV)
- 2019
Qualitative experiments show that the HAMR framework is capable of recovering appealing 3D hand mesh even in the presence of severe occlusions, and outperforms the state-of-the-art methods for both 2D and 3D hand pose estimation from a monocular RGB image on several benchmark datasets.
Self-Supervised Learning of Detailed 3D Face Reconstruction
- Computer Science · IEEE Transactions on Image Processing
- 2020
An end-to-end learning framework for detailed 3D face reconstruction from a single image that combines a photometric loss and a facial perceptual loss between the input face and the rendered face, and a displacement map in UV-space to represent a 3D face.
Learning to Estimate 3D Hand Pose from Single RGB Images
- Computer Science · 2017 IEEE International Conference on Computer Vision (ICCV)
- 2017
A deep network is proposed that learns a network-implicit 3D articulation prior that yields good estimates of the 3D pose from regular RGB images, and a large scale 3D hand pose dataset based on synthetic hand models is introduced.
Self-Supervised Multi-level Face Model Learning for Monocular Reconstruction at Over 250 Hz
- Computer Science · 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition
- 2018
The first approach that jointly learns a regressor for face shape, expression, reflectance and illumination on the basis of a concurrently learned parametric face model is presented; it compares favorably to the state-of-the-art in terms of reconstruction quality, generalizes better to real-world faces, and runs at over 250 Hz.
Robust 3D Hand Pose Estimation in Single Depth Images: From Single-View CNN to Multi-View CNNs
- Computer Science · 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
- 2016
This work proposes to first project the query depth image onto three orthogonal planes and utilize these multi-view projections to regress for 2D heat-maps which estimate the joint positions on each plane to produce final 3D hand pose estimation with learned pose priors.