Dual Grid Net: Hand Mesh Vertex Regression from Single Depth Maps

Chengde Wan, Thomas Probst, Luc Van Gool, and Angela Yao present a method for recovering the dense 3D surface of the hand by regressing the vertex coordinates of a mesh model from a single depth map. In the first stage, the network estimates a dense correspondence field from every pixel on the depth-map (image) grid to the mesh grid; in the second stage, a differentiable operator maps the features learned in the first stage onto the mesh grid, where a 3D coordinate map is regressed.
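The two-stage idea (per-pixel correspondences to the mesh grid, then feature pooling and coordinate regression) can be sketched with toy arrays. All sizes, the softmax assignment, and the linear regression head below are illustrative assumptions, not the paper's architecture:

```python
import numpy as np

# Stage 1 predicts, for each image pixel, a soft correspondence over mesh-grid
# cells; stage 2 pools pixel features onto the mesh grid with those weights
# and regresses a 3D coordinate per mesh cell.

rng = np.random.default_rng(0)
H, W, C = 8, 8, 16          # image grid and feature channels (made up)
M = 32                      # number of mesh-grid cells (made up)

pixel_feats = rng.normal(size=(H * W, C))   # per-pixel features
logits = rng.normal(size=(H * W, M))        # predicted correspondence logits

# Softmax over mesh cells -> differentiable soft assignment per pixel.
w = np.exp(logits - logits.max(axis=1, keepdims=True))
w /= w.sum(axis=1, keepdims=True)

# Weighted-average pooling: mesh_feats[m] = sum_p w[p, m] * feats[p] / sum_p w[p, m]
mesh_feats = (w.T @ pixel_feats) / w.sum(axis=0, keepdims=True).T

# A final regressor maps pooled features to 3D coordinates per mesh cell;
# a single random linear head stands in for it here.
coords = mesh_feats @ rng.normal(size=(C, 3))
print(coords.shape)  # (32, 3)
```

Because the assignment is a softmax rather than a hard argmax, gradients flow from the mesh-grid coordinates back to the correspondence logits, which is what makes the mapping operator differentiable.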

End-to-end Weakly-supervised Multiple 3D Hand Mesh Reconstruction from Single Image

This paper designs a multi-head autoencoder for multi-hand reconstruction, in which each head network shares the same feature map and outputs the hand center, pose, and texture, respectively; a weakly-supervised scheme alleviates the burden of expensive 3D real-world annotations.

Towards Accurate Alignment in Real-time 3D Hand-Mesh Reconstruction

The results outperform state-of-the-art methods in hand-mesh/pose precision and hand-image alignment, and several real-time AR scenarios are showcased.

Local and Global Point Cloud Reconstruction for 3D Hand Pose Estimation

This paper presents a novel pipeline for local and global point cloud reconstruction using a 3D hand template while learning a latent representation for pose estimation, and shows that the proposed method achieves comparable or better performance than existing 3D hand pose and shape estimation methods.

Image-free Domain Generalization via CLIP for 3D Hand Pose Estimation

This paper manipulates the image features of the hand pose estimation network by adding features derived from text descriptions using the CLIP (Contrastive Language-Image Pre-training) model, and shows improved performance over state-of-the-art domain generalization approaches.
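The described manipulation can be sketched as mixing a normalized text-embedding direction into image features in a shared CLIP-style embedding space. The dimensionality, mixing weight, and the prompt implied by the text feature are made-up illustrations, not values from the paper:

```python
import numpy as np

# Add a (normalized) text-embedding direction to image features to simulate
# a domain shift, e.g. a text feature for "a sketch of a hand".

rng = np.random.default_rng(1)
D = 512                                  # joint embedding size (assumed)
img_feat = rng.normal(size=(4, D))       # batch of image features
txt_feat = rng.normal(size=(D,))         # stand-in text embedding

def l2norm(x, axis=-1):
    """Project features onto the unit sphere, as CLIP-style spaces assume."""
    return x / np.linalg.norm(x, axis=axis, keepdims=True)

alpha = 0.3                              # mixing weight (assumed)
mixed = l2norm(l2norm(img_feat) + alpha * l2norm(txt_feat))
print(mixed.shape)  # (4, 512)
```

Training the pose estimator on such text-shifted features is what lets the method generalize across visual domains without needing images from those domains.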

HandTailor: Towards High-Precision Monocular 3D Hand Recovery

This work introduces a novel framework HandTailor, which combines a learning-based hand module and an optimization-based tailor module to achieve high-precision hand mesh recovery from a monocular RGB image.

A Skeleton-Driven Neural Occupancy Representation for Articulated Hands

The differentiable nature of HALO is shown to improve the quality of synthesized hands in both physical plausibility and user preference; the representation can be trained end-to-end, allowing losses to be formulated on the hand surface that benefit the learning of 3D keypoints.

Automatic and Fast Extraction of 3D Hand Measurements using a Deep Neural Network

This paper proposes the first deep neural network for automatic hand measurement extraction from a single 3D scan (H-Net), following an encoder-decoder architecture design, taking a point cloud of the hand as input and outputting the reconstructed hand mesh as well as the corresponding measurement values.

Unsupervised Domain Adaptation with Temporal-Consistent Self-Training for 3D Hand-Object Joint Reconstruction

This work introduces an effective approach that exploits 3D geometric constraints within a cycle generative adversarial network (CycleGAN) to perform domain adaptation, and proposes enforcing short- and long-term temporal consistency to fine-tune the domain-adapted model in a self-supervised fashion.

A Deep Learning Approach to Automatically Extract 3D Hand Measurements

This paper proposes the first deep-learning-based method to automatically measure the hand in a non-contact manner from a single 3D hand scan, and demonstrates that the proposed method outperforms the state-of-the-art method in most hand measurement types.

UV-Based 3D Hand-Object Reconstruction with Grasp Optimization

This work proposes a novel framework for 3D hand shape reconstruction and hand-object grasp optimization from a single RGB image using a dense representation in the form of a UV coordinate map and introduces inference-time optimization to improve interactions between the hand and the object.
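The dense UV representation above can be pictured as a position map: an image in UV space whose texels store 3D coordinates, from which mesh vertices are recovered by sampling at fixed per-vertex (u, v) locations. The resolution, vertex count, and bilinear sampler below are assumptions for illustration, not the paper's implementation:

```python
import numpy as np

# A UV position map stores an xyz coordinate at every texel; mesh vertices
# are recovered by bilinear sampling at each vertex's fixed UV location.

rng = np.random.default_rng(4)
S = 16                                       # UV map resolution (assumed)
uv_map = rng.normal(size=(S, S, 3))          # network output: xyz per texel
verts_uv = rng.uniform(0, 1, size=(10, 2))   # fixed UVs for 10 vertices

def bilinear_sample(img, uv):
    """Bilinearly sample img of shape (S, S, 3) at continuous uv in [0, 1]^2."""
    x = uv[:, 0] * (img.shape[1] - 1)
    y = uv[:, 1] * (img.shape[0] - 1)
    x0, y0 = np.floor(x).astype(int), np.floor(y).astype(int)
    x1 = np.minimum(x0 + 1, img.shape[1] - 1)
    y1 = np.minimum(y0 + 1, img.shape[0] - 1)
    wx, wy = (x - x0)[:, None], (y - y0)[:, None]
    return (img[y0, x0] * (1 - wx) * (1 - wy) + img[y0, x1] * wx * (1 - wy)
            + img[y1, x0] * (1 - wx) * wy + img[y1, x1] * wx * wy)

vertices = bilinear_sample(uv_map, verts_uv)
print(vertices.shape)  # (10, 3)
```

Predicting the map densely in UV space, rather than a flat vertex list, keeps neighboring vertices spatially adjacent in the output, which is what makes the representation amenable to convolutional decoders.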

End-to-End Hand Mesh Recovery From a Monocular RGB Image

Qualitative experiments show that the HAMR framework can recover appealing 3D hand meshes even in the presence of severe occlusions, and it outperforms state-of-the-art methods for both 2D and 3D hand pose estimation from a monocular RGB image on several benchmark datasets.

DeepHPS: End-to-end Estimation of 3D Hand Pose and Shape by Learning from Synthetic Depth

A fully supervised deep network is proposed that learns to jointly estimate a full 3D hand mesh representation and pose from a single depth image, improving on the results of hybrid (model-based learning) methods on two public benchmarks.

3D Hand Shape and Pose Estimation From a Single RGB Image

This work proposes a Graph Convolutional Neural Network (Graph CNN) based method to reconstruct a full 3D mesh of the hand surface, which contains richer information about both 3D hand shape and pose, and proposes a weakly-supervised approach that leverages the depth map as weak supervision during training.
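A single graph-convolution layer over mesh vertices, the building block of such Graph CNN mesh regression, can be sketched as follows. The tiny 4-vertex "mesh" and the Kipf-Welling-style symmetric normalization are illustrative assumptions:

```python
import numpy as np

# One graph-convolution step: normalize the vertex adjacency (with
# self-loops), aggregate neighbor features, and apply a learned linear map.

A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 1],
              [1, 1, 0, 1],
              [0, 1, 1, 0]], dtype=float)           # vertex adjacency
X = np.random.default_rng(2).normal(size=(4, 8))    # per-vertex features
W = np.random.default_rng(3).normal(size=(8, 3))    # stand-in weights -> xyz

A_hat = A + np.eye(4)                               # add self-loops
d = A_hat.sum(axis=1)
D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
A_norm = D_inv_sqrt @ A_hat @ D_inv_sqrt            # symmetric normalization

vertices = A_norm @ X @ W                           # one layer of aggregation
print(vertices.shape)  # (4, 3)
```

Stacking such layers over the fixed mesh topology lets each vertex's predicted coordinate depend on an increasingly large neighborhood, which is what allows the network to output a coherent surface rather than independent points.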

Dense Human Body Correspondences Using Convolutional Networks

This work uses a deep convolutional neural network to train a feature descriptor on depth map pixels, but crucially, rather than training the network to solve the shape correspondence problem directly, it trains it to solve a body region classification problem, modified to increase the smoothness of the learned descriptors near region boundaries.

V2V-PoseNet: Voxel-to-Voxel Prediction Network for Accurate 3D Hand and Human Pose Estimation from a Single Depth Map

This model is designed as a 3D CNN that provides accurate estimates while running in real time; it outperforms previous methods on almost all publicly available 3D hand and human pose estimation datasets and placed first in the HANDS 2017 frame-based 3D hand pose estimation challenge.

User-Specific Hand Modeling from Monocular Depth Sequences

An objective is proposed that measures the error of fit between each sampled data point and a continuous model surface defined by a rigged control mesh, and uses as-rigid-as-possible (ARAP) regularizers to cleanly separate the model and template geometries.

DenseReg: Fully Convolutional Dense Shape Regression In-the-Wild

The proposed system, called DenseReg, estimates dense image-to-template correspondences in a fully convolutional manner. It provides useful correspondence information as a stand-alone system, and when used to initialize Statistical Deformable Models, the authors obtain landmark localization results that largely outperform the current state of the art on the challenging 300W benchmark.

3D Hand Shape and Pose From Images in the Wild

This work presents the first end-to-end deep learning based method that predicts both 3D hand shape and pose from RGB images in the wild, consisting of the concatenation of a deep convolutional encoder, and a fixed model-based decoder.

Dense 3D Regression for Hand Pose Estimation

A simple and effective method for 3D hand pose estimation from a single depth frame, based on dense pixel-wise estimation, which outperforms all previous state-of-the-art approaches by a large margin.

End-to-End Recovery of Human Shape and Pose

This work introduces an adversary trained to tell whether human body shape and pose parameters are real or not using a large database of 3D human meshes, and produces a richer and more useful mesh representation that is parameterized by shape and 3D joint angles.