A large variety of trackers has been proposed in the literature over the last two decades, with mixed success. Object tracking in realistic scenarios is a difficult problem and therefore remains one of the most active areas of research in computer vision. A good tracker should perform well in a large number of videos involving illumination…
Data association is an essential component of any human tracking system. The majority of current methods, such as bipartite matching, incorporate only limited temporal locality of the sequence into the data association problem, which makes them inherently prone to ID switches and to difficulties caused by long-term occlusion, cluttered background, and crowded…
Single-camera multiple-person tracking is often hindered by difficulties such as occlusion and changes in appearance. In this paper, we address these problems by proposing a robust part-based tracking-by-detection framework. Human detection using part models has become quite popular, yet its extension to tracking has not been fully explored. Our…
Data association is the backbone of many multiple object tracking (MOT) methods. In the literature, most data association methods consider a simplified version of the problem and focus on approximate inference methods that can be solved efficiently [2, 3]. On the other hand, algorithms that incorporate a more accurate formulation of…
In this paper, we show that multiple object tracking (MOT) can be formulated in a framework where detection and data association are performed simultaneously. Our method overcomes the limitations of data-association-based MOT approaches, whose performance depends on the object detection results provided as input. At the core…
We propose an approach to improve the detection performance of a generic detector when it is applied to a particular video. The performance of offline-trained object detectors is usually degraded in unconstrained video environments due to varying illumination, backgrounds, and camera viewpoints. Moreover, most object detectors are trained using Haar-like…
A video captures a sequence and interactions of concepts that can be static (for instance, objects or scenes) or dynamic (such as actions). For large datasets containing hundreds of thousands of images or videos, it is impractical to manually annotate all the concepts, or all the instances of a single concept. However, a dictionary with visually-distinct…
Recent years have seen a major push for face recognition technology due to the large expansion of image sharing on social networks. In this paper, we consider the difficult task of determining parent-offspring resemblance using deep learning to answer the question "Who do I look like?" Although humans can perform this job at a rate higher than chance, it…
Manual analysis of pedestrians and crowds is often impractical for massive datasets of surveillance videos. Automatic tracking of humans is one of the essential abilities for computerized analysis of such videos. In this keynote paper, we present two state-of-the-art methods for automatic pedestrian tracking in videos with low and high crowd density. For…
Complex event recognition is an expanding research area aiming to recognize entities of high-level semantics in videos. Typical approaches exploit the so-called "bags" of spatio-temporal features such as STIP, ISA and DTF-HOG; yet, more recently, the notion of concept has emerged as an alternative, intermediate representation with greater descriptive…