Learn More
In various computer vision applications, often we need to convert an image in one style into another style for better visualization, interpretation and recognition; for examples, up-convert a low resolution image to a high resolution one, and convert a face sketch into a photo for matching, etc. A semi-coupled dictionary learning (SCDL) model is proposed in(More)
Regularized linear representation learning has led to interesting results in image classification, while how the object should be represented is a critical issue to be investigated. Considering the fact that the different features in a sample should contribute differently to the pattern representation and classification, in this paper we present a novel(More)
In this paper we introduce the TorontoCity benchmark, which covers the full greater Toronto area (GTA) with 712.5km 2 of land, 8439km of road and around 400, 000 buildings. Our benchmark provides different perspectives of the world captured from airplanes, drones and cars driving around the city. Manually labeling such a large scale dataset is infeasible.(More)
In recent years, contextual models that exploit maps have been shown to be very effective for many recognition and localization tasks. In this paper we propose to exploit aerial images in order to enhance freely available world maps. Towards this goal, we make use of OpenStreetMap and formulate the problem as the one of inference in a Markov random field(More)
In this paper we are interested in exploiting geographic priors to help outdoor scene understanding. Towards this goal we propose a holistic approach that reasons jointly about 3D object detection, pose estimation, semantic segmentation as well as depth reconstruction from a single image. Our approach takes advantage of large-scale crowd-sourced maps to(More)
In this paper we propose a novel approach to localization in very large indoor spaces (i.e., 200+ store shopping malls) that takes a single image and a floor plan of the environment as input. We formulate the localization problem as inference in a Markov random field, which jointly reasons about text detection (localizing shop's names in the image with(More)
In this paper we present an approach to enhance existing maps with fine grained segmentation categories such as parking spots and sidewalk, as well as the number and location of road lanes. Towards this goal, we propose an efficient approach that is able to estimate these fine grained categories by doing joint inference over both, monocular aerial imagery,(More)