Learn More
—We present a novel and practical deep fully convolutional neural network architecture for semantic pixel-wise segmentation termed SegNet. This core trainable segmentation engine consists of an encoder network, a corresponding decoder network followed by a pixel-wise classification layer. The architecture of the encoder network is topologically identical to(More)
We propose semantic texton forests, efficient and powerful new low-level features. These are ensembles of decision trees that act directly on image pixels, and therefore do not need the expensive computation of filter-bank responses or local descriptors. They are extremely fast to both train and test, especially compared with k-means clustering and(More)
We address the problem of comparing sets of images for object recognition, where the sets may represent variations in an object's appearance due to changing camera pose and lighting conditions. canonical correlations (also known as principal or canonical angles), which can be thought of as the angles between two d-dimensional subspaces, have recently(More)
We propose an algorithm for semantic segmentation based on 3D point clouds derived from ego-motion. We motivate five simple cues designed to model specific patterns of motion and 3D world structure that vary with object category. We introduce features that project the 3D cues back to the 2D image plane while modeling spatial layout and context. A randomized(More)
Visual object analysis researchers are increasingly experimenting with video, because it is expected that motion cues should help with detection, recognition, and other analysis tasks. This paper presents the Cambridge-driving Labeled Video Database (CamVid) as the first collection of videos with object class semantic labels, complete with metadata. The(More)
ÐThis paper presents a novel framework for three-dimensional model-based tracking. Graphical rendering technology is combined with constrained active contour tracking to create a robust wire-frame tracking system. It operates in real time at video frame rate (25 Hz) on standard hardware. It is based on an internal CAD model of the object to be tracked which(More)
We present a robust and real-time monocular six degree of freedom relocalization system. Our system trains a convolutional neural network to regress the 6-DOF camera pose from a single RGB image in an end-to-end manner with no need of additional engineering or graph optimisation. The algorithm can operate indoors and outdoors in real time, taking 5ms per(More)
We propose a novel deep architecture, SegNet, for semantic pixel wise image labelling 1. SegNet has several attractive properties; (i) it only requires forward evaluation of a fully learnt function to obtain smooth label predictions, (ii) with increasing depth, a larger context is considered for pixel labelling which improves accuracy, and (iii) it is easy(More)
The spatiotemporal analysis of deforming silhouettes (apparent contours) is here extended using the mathematics of perspective projections and tools from differential geometry. Analysis of the image motion of a silhouette or apparent contour enables computation of local surface curvature along the corresponding contour generator on the surface, assuming(More)
Psychophysical studies show that we can recognize objects using fragments of outline contour alone. This paper proposes a new automatic visual recognition system based only on local contour features, capable of localizing objects in space and scale. The system first builds a class-specific codebook of local fragments of contour using a novel formulation of(More)