Fast and accurate detection and localization of abnormal behavior in crowded scenes
Recently, the histogram of oriented tracklets (HOT) was shown to be an efficient video representation for abnormality detection and achieved state-of-the-art results on the available datasets. Unlike standard video descriptors that mainly employ low-level motion features, e.g. optical flow, the HOT descriptor simultaneously encodes the magnitude and orientation of tracklets as a mid-level representation of crowd motion. However, tracklet extraction in HOT suffers from poor salient point initialization and from tracking drift in the presence of occlusion. Moreover, the count-based histogramming used in HOT does not properly account for the characteristics of abnormal motion. This paper addresses these drawbacks by introducing an enhanced version of HOT, named Improved HOT. First, we propose to initialize salient points in every frame rather than only in the first frame, as the original HOT does. Second, we replace the naive count-based histogramming with richer statistics of crowd movement (i.e., motion distribution). The evaluation of the Improved HOT on different datasets, namely UCSD, BEHAVE and UMN, yields compelling results in abnormality detection, outperforming the original HOT and state-of-the-art descriptors based on optical flow, dense trajectories and social force models.
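To make the count-based histogramming concrete, the following is a minimal sketch of how tracklet orientation and magnitude could be jointly binned into a 2D histogram. It is a simplified illustration and not necessarily the authors' exact procedure: the function name `hot_histogram`, the bin counts `n_orient` and `n_mag`, the cap `max_mag`, and the choice to bin every frame-to-frame displacement are all assumptions made for this example.

```python
import numpy as np

def hot_histogram(tracklets, n_orient=8, n_mag=4, max_mag=8.0):
    """Count-based 2D histogram over (orientation, magnitude) bins.

    tracklets: iterable of point sequences, each of shape (T, 2) in pixels.
    n_orient, n_mag, max_mag: hypothetical parameters chosen for illustration.
    """
    hist = np.zeros((n_orient, n_mag), dtype=np.int64)
    for track in tracklets:
        pts = np.asarray(track, dtype=float)
        if pts.shape[0] < 2:
            continue  # a single point carries no motion information
        d = np.diff(pts, axis=0)                      # frame-to-frame displacements
        mag = np.hypot(d[:, 0], d[:, 1])              # displacement magnitude
        theta = np.mod(np.arctan2(d[:, 1], d[:, 0]), 2 * np.pi)  # angle in [0, 2*pi)
        # Quantize angle and magnitude; clip so boundary values fall in the last bin.
        o_bin = np.minimum((theta / (2 * np.pi) * n_orient).astype(int), n_orient - 1)
        m_bin = np.minimum((mag / max_mag * n_mag).astype(int), n_mag - 1)
        # np.add.at accumulates correctly even when the same bin is hit repeatedly.
        np.add.at(hist, (o_bin, m_bin), 1)
    return hist

# Two toy tracklets: one moving slowly along +x, one moving quickly along +y.
example = [[(0, 0), (1, 0), (2, 0)], [(0, 0), (0, 5)]]
h = hot_histogram(example)
```

Each cell of the resulting array only counts how many displacements fall into that orientation and magnitude bin, which is the kind of naive count that the Improved HOT replaces with richer motion statistics.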