In this paper we present a framework for pixel-wise object segmentation of road scenes that combines motion and appearance features. It is designed to handle street-level imagery such as that on Google Street View and Microsoft Bing Maps. We formulate the problem in a CRF framework in order to probabilistically model the label likelihoods and the a priori …
Computer vision algorithms for individual tasks such as object recognition, detection and segmentation have shown impressive results in the recent past. The next challenge is to integrate all these algorithms and address the problem of scene understanding. This paper is a step towards this goal. We present a probabilistic framework for reasoning about …
The problems of dense stereo reconstruction and object class segmentation can both be formulated as Random Field labeling problems, in which every pixel in the image is assigned a label corresponding to either its disparity, or an object class such as road or building. While these two problems are mutually informative, no attempt has been made to jointly …
The problems of object class segmentation [2], which assigns an object label such as road or building to every pixel in the image, and dense stereo reconstruction, in which every pixel within an image is labelled with a disparity [1], are well suited to being solved jointly. Both approaches formulate the problem of providing a correct labelling of an image …
Recently, Krähenbühl and Koltun proposed an efficient inference method for densely connected pairwise random fields using the mean-field approximation for a Conditional Random Field (CRF). However, they restrict their pairwise weights to take the form of a weighted combination of Gaussian kernels where each Gaussian component is allowed to take only zero …
On the one hand, mainly within the computer vision community, multi-resolution image labelling problems with pixel, super-pixel and object levels have made great progress towards the modelling of holistic scene understanding. On the other hand, mainly within the robotics and graphics communities, multi-resolution 3D representations of the world …
This paper describes a method for producing a semantic map from multi-view street-level imagery. We define a semantic map as an overhead, or bird's-eye, view of a region with associated semantic object labels, such as car, road and pavement. We formulate the problem using two conditional random fields. The first is used to model the semantic image …
Linear SVMs are efficient in both training and testing; however, the data in real applications is rarely linearly separable. Non-linear kernel SVMs are too computationally intensive for applications with large-scale data sets. Recently, locally linear classifiers have gained popularity due to their efficiency whilst remaining competitive with kernel methods. …
The concepts of objects and attributes are both important for describing images precisely, since verbal descriptions often contain both adjectives and nouns (e.g. 'I see a shiny red chair'). In this paper, we formulate the problem of joint visual attribute and object class image segmentation as a dense multi-labelling problem, where each pixel in an image …