On-device Real-time Hand Gesture Recognition
@article{Sung2021OndeviceRH,
  title={On-device Real-time Hand Gesture Recognition},
  author={George Sung and Kanstantsin Sokal and Esha Uboweja and Valentin Bazarevsky and Jonathan Baccash and Eduard Gabriel Bazavan and Chuo-Ling Chang and Matthias Grundmann},
  journal={ArXiv},
  year={2021},
  volume={abs/2111.00038}
}
We present an on-device real-time hand gesture recognition (HGR) system, which detects a set of predefined static gestures from a single RGB camera. The system consists of two parts: a hand skeleton tracker and a gesture classifier. We use MediaPipe Hands [14, 2] as the basis of the hand skeleton tracker, improve the keypoint accuracy, and add the estimation of 3D keypoints in a world metric space. We create two different gesture classifiers, one based on heuristics and the other using neural…
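The two-part design described above (a tracker that outputs hand keypoints, followed by a gesture classifier) can be illustrated with a minimal sketch of a heuristic classifier. This is not the rule set used in the paper, which the abstract does not spell out. It assumes the 21-keypoint layout of MediaPipe Hands (wrist at index 0, fingertips at 8, 12, 16, 20) and a hypothetical rule: a finger counts as extended when its tip is farther from the wrist than its PIP joint. The gesture names, the `classify_gesture` function, and the `make_hand` helper are illustrative assumptions, and the thumb is ignored for simplicity.

```python
import math
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float, float]

WRIST = 0
# (PIP joint index, fingertip index) per finger, assuming the
# 21-keypoint MediaPipe Hands layout. The thumb is left out on purpose.
FINGERS: Dict[str, Tuple[int, int]] = {
    "index": (6, 8),
    "middle": (10, 12),
    "ring": (14, 16),
    "pinky": (18, 20),
}


def extended_fingers(keypoints: Sequence[Point]) -> Dict[str, bool]:
    """Hypothetical rule: a finger is extended if its tip is farther
    from the wrist than its PIP joint."""
    if len(keypoints) != 21:
        raise ValueError("expected 21 hand keypoints")
    wrist = keypoints[WRIST]
    return {
        name: math.dist(wrist, keypoints[tip]) > math.dist(wrist, keypoints[pip])
        for name, (pip, tip) in FINGERS.items()
    }


def classify_gesture(keypoints: Sequence[Point]) -> str:
    """Map the set of extended fingers to an illustrative static gesture."""
    up = {name for name, is_up in extended_fingers(keypoints).items() if is_up}
    if up == set(FINGERS):
        return "open_palm"
    if not up:
        return "fist"
    if up == {"index", "middle"}:
        return "victory"
    return "unknown"


def make_hand(extended: Sequence[str]) -> List[Point]:
    """Build synthetic keypoints for a quick check (not real tracker output)."""
    pts: List[Point] = [(0.0, 0.0, 0.0)] * 21
    for i, (name, (pip, tip)) in enumerate(FINGERS.items()):
        x = float(i)
        pts[pip] = (x, 2.0, 0.0)
        # Extended tips sit beyond the PIP joint; curled tips fold back toward the wrist.
        pts[tip] = (x, 4.0, 0.0) if name in extended else (x, 1.2, 0.0)
    return pts


if __name__ == "__main__":
    print(classify_gesture(make_hand(["index", "middle"])))
```

A distance-based rule like this works in any consistent coordinate frame, including metric 3D keypoints, but real hands would need per-gesture tuning and thumb handling.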
2 Citations
BlazePose GHUM Holistic: Real-time 3D Human Landmarks and Pose Estimation
- Computer Science, ArXiv
- 2022
The main contributions include i) a novel method for 3D ground truth data acquisition, ii) updated 3D body tracking with additional hand landmarks and iii) full body pose estimation from a monocular image.
Internet of things system for ultraviolet index monitoring in the community of Chirinche Bajo
- Economics, REVISTA ODIGOS
- 2022
The impact of the ultraviolet radiation index is becoming more intense and dangerous for the health of the epidermis and eyesight of people, especially for farmers in the Chirinche Bajo (Ecuador)…
References
Skeleton-Based Dynamic Hand Gesture Recognition
- Computer Science, 2016 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)
- 2016
The geometric shape of the hand is exploited to extract an effective descriptor from the connected hand-skeleton joints returned by the Intel RealSense depth camera; classification is then performed by a linear SVM.
Hand Gesture Recognition Based on Computer Vision: A Review of Techniques
- Computer Science, J. Imaging
- 2020
Reviews the literature on hand gesture techniques, introduces their merits and limitations under different circumstances, and tabulates the performance of these methods, focusing on computer vision techniques and their similarities and differences.
MediaPipe Hands: On-device Real-time Hand Tracking
- Computer Science, ArXiv
- 2020
A real-time on-device hand tracking pipeline that predicts a hand skeleton from a single RGB camera for AR/VR applications, implemented with MediaPipe, a framework for building cross-platform ML solutions.
Deep Learning for Hand Gesture Recognition on Skeletal Data
- Computer Science, 2018 13th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2018)
- 2018
A new Convolutional Neural Network (CNN), in which sequences of hand-skeletal joints' positions are processed by parallel convolutions, is proposed; the model achieves state-of-the-art performance on a challenging dataset.
Vision based hand gesture recognition for human computer interaction: a survey
- Computer Science, Artificial Intelligence Review
- 2012
An analysis of comparative surveys done in the field of gesture based HCI and an analysis of existing literature related to gesture recognition systems for human computer interaction by categorizing it under different key parameters are provided.
Focal Loss for Dense Object Detection
- Computer Science, IEEE Transactions on Pattern Analysis and Machine Intelligence
- 2020
This paper proposes to address the extreme foreground-background class imbalance encountered during training of dense detectors by reshaping the standard cross entropy loss such that it down-weights the loss assigned to well-classified examples, and develops a novel Focal Loss, which focuses training on a sparse set of hard examples and prevents the vast number of easy negatives from overwhelming the detector during training.
Focal Loss for Dense Object Detection
- Computer Science, 2017 IEEE International Conference on Computer Vision (ICCV)
- 2017
This paper proposes to address the extreme foreground-background class imbalance encountered during training of dense detectors by reshaping the standard cross entropy loss such that it down-weights the loss assigned to well-classified examples, and develops a novel Focal Loss, which focuses training on a sparse set of hard examples and prevents the vast number of easy negatives from overwhelming the detector during training.
Weakly Supervised 3D Human Pose and Shape Reconstruction with Normalizing Flows
- Computer Science, ECCV
- 2020
This paper shows that the proposed methods outperform the state of the art, supporting the practical construction of an accurate family of models based on large-scale training with diverse and incompletely labeled image and video data.
GHUM & GHUML: Generative 3D Human Shape and Articulated Pose Models
- Computer Science, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
- 2020
A statistical, articulated 3D human shape modeling pipeline, within a fully trainable, modular, deep learning framework, that supports facial expression analysis, as well as body shape and pose estimation.
MediaPipe: A Framework for Building Perception Pipelines
- Computer Science, ArXiv
- 2019
This work shows that these features enable a developer to focus on the algorithm or model development and use MediaPipe as an environment for iteratively improving their application with results reproducible across different devices and platforms.