Learn More
Detecting and reading text from natural images is a hard computer vision task that is central to a variety of emerging applications. Related problems like document character recognition have been widely studied by computer vision and machine learning researchers and are virtually solved for practical applications like reading handwritten digits. Reliably(More)
We describe Photo OCR, a system for text extraction from images. Our particular focus is reliable text extraction from smartphone imagery, with the goal of text recognition as a user input modality similar to speech recognition. Commercially available OCR performs poorly on this task. Recent progress in machine learning has substantially improved isolated(More)
Modeling and recognizing landmarks at world-scale is a useful yet challenging task. There exists no readily available list of worldwide landmarks. Obtaining reliable visual models for each landmark can also pose problems, and efficiency is another challenge for such a large scale system. This paper leverages the vast amount of multimedia data on the Web,(More)
The last two years have witnessed the introduction and rapid expansion of products based upon large, systematically-gathered, street-level image collections, such as Google Street View, EveryScape, and Mapjack. In the process of gathering images of public spaces, these projects also capture license plates, faces, and other information considered sensitive(More)
We address the problem of estimating human pose in video sequences, where rough location has been determined. We exploit both appearance and motion information by defining suitable features of an image and its temporal neighbors, and learning a regression map to the parameters of a model of the human body using boosting techniques. Our algorithm can be(More)
We address the problem of performing decision tasks and, in particular, classification and recognition in the space of dynamical models in order to compare time series of data. Motivated by the application of recognition of human motion in image sequences, we consider a class of models that include linear dynamics, both stable and marginally stable(More)
We propose a hybrid dynamical model of human motion and develop a classification algorithm for the purpose of analysis and recognition. We assume that some temporal statistics are extracted from the images, and use them to infer a dynamical model that explicitly represents ground contact events. Such events correspond to “switches” between symmetric sets of(More)
We propose a simple model of human motion as a switching linear dynamical system where the switches correspond to contact forces with the ground. This significantly improves the modeling performance when compared to simpler linear systems, with only marginal increase in complexity. We introduce a novel closed-form (non-iterative) algorithm to estimate the(More)