Fast Hand Detection in Collaborative Learning Environments

@inproceedings{Teeparthi2021FastHD,
  title={Fast Hand Detection in Collaborative Learning Environments},
  author={Sravani Teeparthi and Venkatesh Jatla and Marios S. Pattichis and Sylvia Celed{\'o}n-Pattichis and Carlos LopezLeiva},
  booktitle={CAIP},
  year={2021}
}
Long-term object detection requires the integration of frame-based results over several seconds. For non-deformable objects, long-term detection is often addressed using object detection followed by video tracking. Unfortunately, tracking is inapplicable to objects that undergo dramatic changes in appearance from frame to frame. As a related example, we study hand detection over long video recordings in collaborative learning environments. More specifically, we develop long-term hand detection…

Person Detection in Collaborative Group Learning Environments Using Multiple Representations
We introduce the problem of detecting a group of students from classroom videos. The problem requires the detection of students from different angles and the separation of the group from other groups
Bilingual Speech Recognition by Estimating Speaker Geometry from Video Data
TLDR
A bilingual speech recognition system that uses an interactive video analysis system to estimate the 3D speaker geometry for realistic audio simulations and demonstrates the use of the system in generating a complex audio dataset that contains significant cross-talk and background noise that approximate real-life classroom recordings.

References

You Only Look Once: Unified, Real-Time Object Detection
TLDR
Compared to state-of-the-art detection systems, YOLO makes more localization errors but is less likely to predict false positives on background, and outperforms other detection methods, including DPM and R-CNN, when generalizing from natural images to other domains like artwork.
Facial Recognition in Collaborative Learning Videos
TLDR
This work develops a dynamic system of recognizing participants in collaborative learning systems by associating each participant with a collection of prototype faces computed through sampling or K-means clustering and addresses occlusion and recognition failures by using past information about the face detection history.
SSD: Single Shot MultiBox Detector
TLDR
The approach, named SSD, discretizes the output space of bounding boxes into a set of default boxes over different aspect ratios and scales per feature map location, which makes SSD easy to train and straightforward to integrate into systems that require a detection component.
Robust Head Detection in Collaborative Learning Environments Using AM-FM Representations
TLDR
Two new methods based on Amplitude Modulation-Frequency Modulation (AM-FM) models are proposed for robust head detection in collaborative learning environments and a combined approach based on color and FM texture is developed for robust face detection.
Talking Detection In Collaborative Learning Environments
TLDR
The proposed approach uses head detection and projections of the log-magnitude of optical flow vectors to reduce the problem to a simple classification of small projection images without the need for training complex, 3-D activity classification systems.
Hand Movement Detection in Collaborative Learning Environment Videos
TLDR
This thesis explores hand movement detection from color and optical flow, using patch color classification, space-time patches of video, and histograms of optical flow, to address the problem of human activity detection in digital videos.
Soft-NMS — Improving Object Detection with One Line of Code
TLDR
Soft-NMS is proposed, an algorithm that decays the detection scores of all other objects as a continuous function of their overlap with M, the highest-scoring detection, and improves the state of the art in object detection from 39.8% to 40.9% with a single model.
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
TLDR
This work introduces a Region Proposal Network (RPN) that shares full-image convolutional features with the detection network, thus enabling nearly cost-free region proposals, and further merges RPN and Fast R-CNN into a single network by sharing their convolutional features.
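
The score decay described in the Soft-NMS summary above can be illustrated with a short sketch. This is not taken from the hand detection paper or from any reference implementation; it assumes a Gaussian decay function, boxes given as `(x1, y1, x2, y2)` tuples, and hypothetical parameter names (`sigma`, `score_threshold`) chosen only for illustration.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def soft_nms(boxes, scores, sigma=0.5, score_threshold=0.001):
    """Gaussian Soft-NMS sketch.

    Returns (box_index, decayed_score) pairs in the order they were selected.
    sigma and score_threshold are illustrative defaults, not tuned values.
    """
    remaining = list(enumerate(scores))
    kept = []
    while remaining:
        # Select the highest-scoring remaining box (called M in the summary).
        m = max(range(len(remaining)), key=lambda k: remaining[k][1])
        m_idx, m_score = remaining.pop(m)
        if m_score < score_threshold:
            break  # every remaining score is at or below this one
        kept.append((m_idx, m_score))
        # Decay the other scores as a continuous function of overlap with M,
        # instead of discarding overlapping boxes outright as hard NMS does.
        remaining = [
            (i, s * math.exp(-(iou(boxes[m_idx], boxes[i]) ** 2) / sigma))
            for i, s in remaining
        ]
    return kept


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
for index, score in soft_nms(boxes, scores):
    print(index, round(score, 3))
```

In this toy input the second box overlaps the first heavily, so it is kept with a reduced score and ranked after the non-overlapping third box, rather than being removed.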