ChaLearn Looking at People Challenge 2014: Dataset and Results

@inproceedings{Escalera2014ChaLearnLA,
  title={ChaLearn Looking at People Challenge 2014: Dataset and Results},
  author={Sergio Escalera and Xavier Bar{\'o} and Jordi Gonz{\`a}lez and Miguel {\'A}ngel Bautista and Meysam Madadi and Miguel Reyes and V{\'i}ctor Ponce-L{\'o}pez and Hugo Jair Escalante and Jamie Shotton and Isabelle Guyon},
  booktitle={ECCV Workshops},
  year={2014}
}
This paper summarizes the ChaLearn Looking at People 2014 challenge data and the results obtained by the participants. [...] Outstanding results were achieved in the three challenge tracks, with accuracy results of 0.20, 0.50, and 0.85 for pose recovery, action/interaction recognition, and multi-modal gesture recognition, respectively.

ChaLearn Looking at People 2015 challenges: Action spotting and cultural event recognition
TLDR
This paper summarizes the two challenges to be presented at CVPR 2015: action/interaction spotting and cultural event recognition in RGB data and the obtained results.
ChaLearn Looking at People 2015: Apparent Age and Cultural Event Recognition Datasets and Results
TLDR
A crowd-sourcing application was developed to collect and label data about the apparent age of people (as opposed to the real age) and in terms of cultural event recognition, one hundred categories had to be recognized.
ChaLearn Looking at People: IsoGD and ConGD Large-Scale RGB-D Gesture Recognition
TLDR
This article proposes a bidirectional long short-term memory (Bi-LSTM) method, determining video division points based on skeleton points, and introduces the corrected segmentation rate (CSR) metric to evaluate the performance of temporal segmentation for continuous gesture recognition.
ChaLearn Looking at People RGB-D Isolated and Continuous Datasets for Gesture Recognition
TLDR
Two large multi-modal video datasets for RGB and RGB-D gesture recognition are presented, along with a baseline method based on the bag of visual words model, designed for gesture classification from segmented data.
ChaLearn Looking at People and Faces of the World: Face Analysis Workshop and Challenge 2016
TLDR
A custom-built application was used to collect and label data about the apparent age of people (as opposed to the real age) and the citizen-science Zooniverse platform was used for the Faces of the World data.
ChaLearn looking at people 2015 new competitions: Age estimation and cultural event recognition
TLDR
This paper proposes the first crowd-sourcing application to collect and label data about apparent age of people instead of the real age, which involves scene understanding and human analysis in terms of cultural event recognition.
Results and Analysis of ChaLearn LAP Multi-modal Isolated and Continuous Gesture Recognition, and Real Versus Fake Expressed Emotions Challenges
TLDR
In this second round of both gesture recognition challenges, which were first held in the context of the ICPR 2016 workshop on "multimedia challenges beyond visual analysis", performance improved considerably compared to the first round.
ChaLearn LAP 2016: First Round Challenge on First Impressions - Dataset and Results
TLDR
This paper summarizes the ChaLearn Looking at People 2016 First Impressions challenge data and results obtained by the teams in the first round of the competition, to automatically evaluate five “apparent” personality traits from videos of subjects speaking in front of a camera, by using human judgment.
ChaLearn looking at people: A review of events and resources
TLDR
The history of ChaLearn Looking at People events is reviewed, and the ChaLearn LAP platform is introduced, where public resources (including code, data and preprints of papers) related to the organized events are available.
Multi-modality Gesture Detection and Recognition with Un-supervision, Randomization and Discrimination
TLDR
The goal of the approach is to identify semantically meaningful content in a densely sampled spatio-temporal feature space for gesture recognition; it develops three concepts under the random forest framework: un-supervision, discrimination, and randomization.

References

SHOWING 1-10 OF 18 REFERENCES
ChaLearn multi-modal gesture recognition 2013: grand challenge and workshop summary
TLDR
A Grand Challenge and Workshop on Multi-Modal Gesture Recognition focused on the recognition of continuous natural gestures from multi-modal data (including RGB, Depth, user mask, Skeletal model, and audio) and a large labeled video database was made available.
Multi-modal gesture recognition challenge 2013: dataset and results
TLDR
A challenge on multi-modal gesture recognition was held with 54 international teams, providing audio, skeletal model, user mask, RGB and depth images; outstanding results were obtained by the first-ranked participants.
Learning realistic human actions from movies
TLDR
A new method for video classification that builds upon and extends several recent ideas, including local space-time features, space-time pyramids and multi-channel non-linear SVMs, is presented and shown to improve state-of-the-art results on the standard KTH action dataset.
Visual Analysis of Humans - Looking at People
TLDR
This unique text/reference provides a coherent and comprehensive overview of all aspects of video analysis of humans and reviews the historical origins of the different existing methods, and predicts future trends and challenges.
HuPBA8k+: Dataset and ECOC-Graph-Cut based segmentation of human limbs
2D Human Pose Estimation: New Benchmark and State of the Art Analysis
TLDR
A novel benchmark "MPII Human Pose" is introduced that makes a significant advance in terms of diversity and difficulty, a contribution that is required for future developments in human body models.
Action Recognition with Improved Trajectories
Heng Wang, C. Schmid. 2013 IEEE International Conference on Computer Vision, 2013.
TLDR
Dense trajectories, shown to be an efficient video representation for action recognition that achieved state-of-the-art results on a variety of datasets, are improved by taking camera motion into account to correct them.
The Pascal Visual Object Classes (VOC) Challenge
TLDR
The state of the art in evaluated methods for both classification and detection is reviewed, including whether the methods are statistically different, what they are learning from the images, and what the methods find easy or confuse.
MODEC: Multimodal Decomposable Models for Human Pose Estimation
TLDR
This paper proposes a multimodal, decomposable model of human pose that explicitly captures a variety of pose modes and outperforms state-of-the-art approaches across the accuracy-speed trade-off curve for several pose datasets.
Detailed Human Data Acquisition of Kitchen Activities: the CMU-Multimodal Activity Database (CMU-MMAC)
TLDR
The database is a focused effort to capture detailed (high spatial and temporal resolution) human data in the kitchen while subjects cook several recipes, and it is currently used to solve problems of multimodal temporal segmentation of activities and activity recognition.