Corpus ID: 236428458

Hand Image Understanding via Deep Multi-Task Learning

Xiong Zhang, Hongsheng Huang, Jianchao Tan, Hongmin Xu, Cheng Yang, Guozhu Peng, Lei Wang, Ji Liu
Analyzing and understanding hand information from multimedia materials such as images or videos is important for many real-world applications and remains an active research topic. Many works focus on recovering hand information from a single image; however, they usually solve a single task, for example, hand mask segmentation, 2D/3D hand pose estimation, or hand mesh reconstruction, and do not perform well in challenging scenarios. To further improve the performance of these tasks…


Robust 3D Hand Pose Estimation in Single Depth Images: From Single-View CNN to Multi-View CNNs
This work first projects the query depth image onto three orthogonal planes, then uses these multi-view projections to regress 2D heat-maps that estimate the joint positions on each plane; these estimates are combined with learned pose priors to produce the final 3D hand pose estimate.
3D Hand Shape and Pose Estimation From a Single RGB Image
This work proposes a Graph Convolutional Neural Network (Graph CNN) based method to reconstruct a full 3D mesh of the hand surface, which contains richer information about both 3D hand shape and pose, along with a weakly-supervised approach that uses the depth map as weak supervision in training.
Cross-Modal Deep Variational Hand Pose Estimation
This work proposes a method to learn a statistical hand model represented by a cross-modal trained latent space via a generative deep neural network, which can be directly used to estimate 3D hand poses from RGB images, outperforming the state-of-the-art in different settings.
3D Hand Shape and Pose From Images in the Wild
This work presents the first end-to-end deep learning based method that predicts both 3D hand shape and pose from RGB images in the wild, consisting of a deep convolutional encoder concatenated with a fixed model-based decoder.
Disentangling Latent Hands for Image Synthesis and Pose Estimation
  • Linlin Yang, Angela Yao
  • Computer Science
  • 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
  • 2019
Experiments show that the dVAE can synthesize highly realistic images of the hand specifiable by both pose and image background content and also estimate 3D hand poses from RGB images with accuracy competitive with state-of-the-art on two public benchmarks.
Weakly-Supervised Domain Adaptation via GAN and Mesh Model for Estimating 3D Hand Poses Interacting Objects
This work proposes a novel end-to-end trainable pipeline that adapts the hand-object domain to the single hand-only domain while learning for HPE; it significantly outperforms state-of-the-art methods trained on hand-only data and is comparable to those supervised by HOI data.
Exploiting Spatial-Temporal Relationships for 3D Pose Estimation via Graph Convolutional Networks
A novel graph-based method to tackle the problem of 3D human body and 3D hand pose estimation from a short sequence of 2D joint detections, where domain knowledge about the human hand (body) configurations is explicitly incorporated into the graph convolutional operations to meet the specific demand of the 3D pose estimation.
HandVoxNet: Deep Voxel-Based Network for 3D Hand Shape and Pose Estimation From a Single Depth Map
This work proposes a novel architecture with 3D convolutions trained in a weakly-supervised manner that produces visually more reasonable and realistic hand shapes on NYU and BigHand2.2M datasets compared to the existing approaches.
End-to-End Hand Mesh Recovery From a Monocular RGB Image
Qualitative experiments show that the HAMR framework is capable of recovering appealing 3D hand mesh even in the presence of severe occlusions, and outperforms the state-of-the-art methods for both 2D and 3D hand pose estimation from a monocular RGB image on several benchmark datasets.
Weakly-Supervised Mesh-Convolutional Hand Reconstruction in the Wild
We introduce a simple and effective network architecture for monocular 3D hand pose estimation consisting of an image encoder followed by a mesh convolutional decoder that is trained through a direct…