Learn More
This paper presents my work on computing shape models that are computationally fast and invariant basic transformations like translation, scaling and rotation. In this paper, I propose shape detection using a feature called shape context. Shape context describes all boundary points of a shape with respect to any single boundary point. Thus it is descriptive(More)
—We present a new dataset with the goal of advancing the state-of-the-art in object recognition by placing the question of object recognition in the context of the broader question of scene understanding. This is achieved by gathering images of complex everyday scenes containing common objects in their natural context. Objects are labeled using per-instance(More)
A common trend in object recognition is to detect and leverage the use of sparse, informative feature points. The use of such features makes the problem more manageable while providing increased robustness to noise and pose variation. In this work we develop an extension of these ideas to the spatio­temporal case. For this purpose, we show that the direct(More)
In this paper, we address the problem of learning an adaptive appearance model for object tracking. In particular, a class of tracking techniques called “tracking by detection” have been shown to give promising results at real-time speeds. These methods train a discriminative classifier in an online manner to separate the object from the(More)
Multi-resolution image features may be approximated via extrapolation from nearby scales, rather than being computed explicitly. This fundamental insight allows us to design object detection algorithms that are as accurate, and considerably faster, than the state-of-the-art. The computational bottleneck of many modern detectors is the computation of(More)
In this paper, we address the problem of tracking an object in a video given its location in the first frame and no other information. Recently, a class of tracking techniques called “tracking by detection” has been shown to give promising results at real-time speeds. These methods train a discriminative classifier in an online manner to(More)
CUB-200-2011 is an extended version of CUB-200 [7], a challenging dataset of 200 bird species. The extended version roughly doubles the number of images per category and adds new part localization annotations. All images are annotated with bounding boxes, part locations, and attribute labels. Images and annotations were filtered by multiple users of(More)
This paper focuses on the problem of word detection and recognition in natural images. The problem is significantly more challenging than reading text in scanned documents, and has only recently gained attention from the computer vision community. Sub-components of the problem, such as text detection and cropped image word recognition, have been studied in(More)
Retrieving images from large and varied collections using image content as a key is a challenging and important problem. We present a new image representation which provides a transformation from the raw pixel data to a small set of image regions which are coherent in color and texture. This " Blobworld " representation is created by clustering pixels in a(More)