The core component of most modern trackers is a discriminative classifier, tasked with distinguishing between the target and the surrounding environment. To cope with natural image changes, this classifier is typically trained with translated and scaled sample patches. Such sets of samples are riddled with redundancies: any overlapping pixels are constrained…
Recent years have seen greater interest in the use of discriminative classifiers in tracking systems, owing to their success in object detection. They are trained online with samples collected during tracking. Unfortunately, the potentially large number of samples becomes a computational burden, which directly conflicts with real-time requirements. On the…
Feature extraction, coding, and pooling are important components of many contemporary object recognition paradigms. In this paper we explore novel pooling techniques that encode the second-order statistics of local descriptors inside a region. To achieve this effect, we introduce multiplicative second-order analogues of average and max-pooling that together…
The Visual Object Tracking challenge 2014, VOT2014, aims at comparing short-term single-object visual trackers that do not apply pre-learned models of object appearance. Results of 38 trackers are presented. The number of tested trackers makes VOT2014 the largest benchmark on short-term tracking to date. For…
Perspective camera calibration has been a research subject for a large group of researchers in the last decades, and as a result several camera calibration methodologies can be found in the literature. However, only a small number of those methods base their approaches on the use of monoplane calibration points. This paper describes one of those…
We address the problem of populating object category detection datasets with dense, per-object 3D reconstructions, bootstrapped from class labels, ground truth figure-ground segmentations and a small set of keypoint annotations. Our proposed algorithm first estimates camera viewpoint using rigid structure-from-motion, then reconstructs object shapes by…
Semantic segmentation and object detection are nowadays dominated by methods operating on regions obtained as a result of a bottom-up grouping process (segmentation) but use feature extractors developed for recognition on fixed-form (e.g. rectangular) patches, with full images as a special case. This is most likely suboptimal. In this paper we focus on…