A survey of vision-based methods for action representation, segmentation and recognition
This paper describes a combined tracking-classification framework for the unsupervised classification of human action. While most existing approaches assume that feature-wise correspondences on people are either fully available or not available at all, this method explicitly formalizes how correspondence probabilities can be used in computation when the correspondences are ambiguous. It can also exploit, in a probabilistic manner, any preprocessed foreground-background segmentation, even one of low confidence. A principled analysis of the problem leads to a novel probabilistic action representation, the correspondence-ambiguous feature histogram array (CAFHA), that is robust to variations across similar actions. Our results show that the new framework outperforms the recent Zelnik-Manor and Irani method for unsupervised event classification. Additionally, the framework is extended to quasi real-time action inference, achieving good recognition accuracy despite changes in person identity and variations in the actions.
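The idea of using correspondence probabilities rather than hard assignments can be illustrated with a small sketch. The code below is not the paper's CAFHA definition, which the abstract does not give. It is a minimal illustration, under assumed inputs, of how an array of histograms might accumulate soft votes. The function name `soft_histogram_array`, the notion of "parts", the quantized feature bins, and all numeric values are hypothetical.

```python
from typing import List, Sequence


def soft_histogram_array(
    bin_indices: Sequence[int],
    part_probs: Sequence[Sequence[float]],
    fg_probs: Sequence[float],
    num_parts: int,
    num_bins: int,
) -> List[List[float]]:
    """Build one histogram per (hypothetical) part using soft votes.

    bin_indices[i]  : quantized feature value of observation i (assumed input)
    part_probs[i][k]: probability that observation i corresponds to part k
    fg_probs[i]     : probability that observation i lies on the foreground
    """
    hists = [[0.0] * num_bins for _ in range(num_parts)]
    for b, probs, fg in zip(bin_indices, part_probs, fg_probs):
        for part in range(num_parts):
            # Each observation votes in every part's histogram, weighted by
            # its foreground probability and its correspondence probability.
            hists[part][b] += fg * probs[part]
    return hists


# Three made-up observations, two parts, three feature bins.
hists = soft_histogram_array(
    bin_indices=[0, 2, 2],
    part_probs=[[1.0, 0.0], [0.5, 0.5], [0.25, 0.75]],
    fg_probs=[1.0, 0.5, 1.0],
    num_parts=2,
    num_bins=3,
)
total_mass = sum(sum(h) for h in hists)
```

In this sketch an observation with an ambiguous correspondence spreads its vote across several histograms, and a low-confidence foreground estimate scales the vote down instead of discarding it. Because each row of `part_probs` sums to one, the total mass in the array equals the sum of the foreground probabilities.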