Dynamic Iterative Refinement for Efficient 3D Hand Pose Estimation

  • John Yang, Yash Bhalgat, Simyung Chang, Fatih Murat Porikli, Nojun Kwak
  • Published 11 November 2021
  • Computer Science
  • 2022 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)
While hand pose estimation is a critical component of most interactive extended reality and gesture recognition systems, contemporary approaches are not optimized for computational and memory efficiency. In this paper, we propose a tiny deep neural network of which partial layers are recursively exploited for refining its previous estimations. During its iterative refinements, we employ learned gating criteria to decide whether to exit from the weight-sharing loop, allowing per-sample… 
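
The abstract describes a weight-sharing loop with a learned exit gate. The sketch below illustrates only that control flow, not the authors' network: `shared_block`, `gate_score`, the gate parameters `w` and `b`, and the thresholds are all hypothetical stand-ins chosen for illustration. A fixed update rule plays the role of the shared layers, and a sigmoid over the size of the last update plays the role of the learned gate.

```python
import math


def shared_block(estimate, target_hint, step=0.5):
    # Hypothetical stand-in for the weight-shared layers: the same
    # parameter (step) is reused on every pass through the loop.
    return [e + step * (t - e) for e, t in zip(estimate, target_hint)]


def gate_score(prev, curr, w=-50.0, b=3.0):
    # Hypothetical stand-in for a learned gate: a sigmoid of a linear
    # function of the update size. Small updates give a high exit score.
    delta = math.sqrt(sum((c - p) ** 2 for p, c in zip(prev, curr)))
    return 1.0 / (1.0 + math.exp(-(w * delta + b)))


def refine(initial, target_hint, max_iters=8, exit_threshold=0.5):
    # Apply the shared block repeatedly; leave the loop as soon as the
    # gate score exceeds the threshold, or after max_iters passes.
    estimate = list(initial)
    for i in range(1, max_iters + 1):
        updated = shared_block(estimate, target_hint)
        p_exit = gate_score(estimate, updated)
        estimate = updated
        if p_exit > exit_threshold:
            return estimate, i
    return estimate, max_iters


# An "easy" sample starts close to its target, a "hard" one far away.
_, easy_iters = refine([0.0, 0.0, 0.0], [0.1, 0.0, 0.0])
_, hard_iters = refine([0.0, 0.0, 0.0], [1.0, 1.0, 1.0])
print(easy_iters, hard_iters)
```

With these toy settings the easy sample leaves the loop after fewer passes than the hard one, which is the per-sample adaptive behaviour the abstract refers to. In a trained model the gate would be learned jointly with the shared layers rather than fixed by hand.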


Interacting Hand-Object Pose Estimation via Dense Mutual Attention

This work proposes a novel dense mutual attention mechanism that models fine-grained dependencies between the hand and the object and outperforms state-of-the-art methods.

Adaptive Computationally Efficient Network for Monocular 3D Hand Pose Estimation

A novel model, called Adaptive Computationally Efficient (ACE) network, is proposed, which takes advantage of a Gaussian kernel based Gate Module to dynamically switch the computation between a light model and a heavy network for feature extraction.

SeqHAND: RGB-Sequence-Based 3D Hand Pose and Shape Estimation

This paper proposes a novel method that generates a synthetic dataset mimicking natural human hand movements by re-engineering the annotations of an existing static hand pose dataset into pose-flows. It shows that using temporal information significantly improves 3D hand pose estimation, outperforming state-of-the-art methods on hand pose estimation benchmarks.

Exploiting Spatial-Temporal Relationships for 3D Pose Estimation via Graph Convolutional Networks

A novel graph-based method to tackle the problem of 3D human body and 3D hand pose estimation from a short sequence of 2D joint detections, where domain knowledge about the human hand (body) configurations is explicitly incorporated into the graph convolutional operations to meet the specific demand of the 3D pose estimation.

Hand Pose Estimation via Latent 2.5D Heatmap Regression

This paper proposes a new method for 3D hand pose estimation from a monocular image through a novel 2.5D pose representation that implicitly learns depth maps and heatmap distributions with a novel CNN architecture.

Cross-Modal Deep Variational Hand Pose Estimation

This work proposes a method to learn a statistical hand model represented by a cross-modal trained latent space via a generative deep neural network, which can be directly used to estimate 3D hand poses from RGB images, outperforming the state of the art in different settings.

3D Hand Shape and Pose From Images in the Wild

This work presents the first end-to-end deep-learning-based method that predicts both 3D hand shape and pose from RGB images in the wild, consisting of a deep convolutional encoder concatenated with a fixed model-based decoder.

Pushing the Envelope for RGB-Based Dense 3D Hand Pose Estimation via Neural Rendering

Experiments on three RGB-based benchmarks show that the framework exceeds state-of-the-art accuracy in 3D pose estimation and also recovers dense 3D hand shapes.

Feature Boosting Network For 3D Pose Estimation

A context consistency gate (CCG) is introduced in this paper, with which the convolutional feature maps are modulated according to their consistency with the context representations, to improve the reliability of the features for representing each body part and enhance the LSTD module.

Unified Egocentric Recognition of 3D Hand-Object Poses and Interactions

A single architecture is proposed that does not rely on external detection algorithms but is instead trained end-to-end on single images; it further merges and propagates information in the temporal domain to infer interactions between hand and object trajectories and to recognize actions.

3D Hand Pose Tracking and Estimation Using Stereo Matching

The quantitative evaluation demonstrates that the proposed stereo-based hand segmentation algorithm is suitable for the state-of-the-art hand pose tracking/estimation algorithms and the tracking quality is comparable to the use of active depth sensors under different challenging scenarios.