Roberto Cipolla

Learn More
We propose semantic texton forests, efficient and powerful new low-level features. These are ensembles of decision trees that act directly on image pixels, and therefore do not need the expensive computation of filter-bank responses or local descriptors. They are extremely fast to both train and test, especially compared with k-means clustering and(More)
We address the problem of comparing sets of images for object recognition, where the sets may represent variations in an object's appearance due to changing camera pose and lighting conditions. canonical correlations (also known as principal or canonical angles), which can be thought of as the angles between two d-dimensional subspaces, have recently(More)
We present a novel and practical deep fully convolutional neural network architecture for semantic pixel-wise segmentation termed SegNet. This core trainable segmentation engine consists of an encoder network, a corresponding decoder network followed by a pixel-wise classification layer. The architecture of the encoder network is topologically identical to(More)
We propose an algorithm for semantic segmentation based on 3D point clouds derived from ego-motion. We motivate five simple cues designed to model specific patterns of motion and 3D world structure that vary with object category. We introduce features that project the 3D cues back to the 2D image plane while modeling spatial layout and context. A randomized(More)
Psychophysical studies show that we can recognize objects using fragments of outline contour alone. This paper proposes a new automatic visual recognition system based only on local contour features, capable of localizing objects in space and scale. The system first builds a class-specific codebook of local fragments of contour using a novel formulation of(More)
ÐThis paper presents a novel framework for three-dimensional model-based tracking. Graphical rendering technology is combined with constrained active contour tracking to create a robust wire-frame tracking system. It operates in real time at video frame rate (25 Hz) on standard hardware. It is based on an internal CAD model of the object to be tracked which(More)
This paper presents a novel formulation for the multi-view scene reconstruction problem. While this formulation benefits from a volumetric scene representation, it is amenable to a computationally tractable global optimisation using Graph-cuts. The algorithm proposed uses the visual hull of the scene to infer occlusions and as a constraint on the topology(More)
0167-8655/$ see front matter 2008 Elsevier B.V. A doi:10.1016/j.patrec.2008.04.005 * Corresponding author. Address: Computer Visio bridge, United Kingdom. Fax: +44 1223 332662. E-mail address: brostow@cc.gatech.edu (G.J. Brost URL: http://mi.eng.cam.ac.uk/research/projects/Vid Visual object analysis researchers are increasingly experimenting with video,(More)
We present a novel categorical object detection scheme that uses only local contour-based features. A two-stage, partially supervised learning architecture is proposed: a rudimentary detector is learned from a very small set of segmented images and applied to a larger training set of un-segmented images; the second stage bootstraps these detections to learn(More)
We present a robust and real-time monocular six degree of freedom relocalization system. Our system trains a convolutional neural network to regress the 6-DOF camera pose from a single RGB image in an end-to-end manner with no need of additional engineering or graph optimisation. The algorithm can operate indoors and outdoors in real time, taking 5ms per(More)