Learn More
—We present a new dataset with the goal of advancing the state-of-the-art in object recognition by placing the question of object recognition in the context of the broader question of scene understanding. This is achieved by gathering images of complex everyday scenes containing common objects in their natural context. Objects are labeled using per-instance(More)
Object segmentation requires both object-level information and low-level pixel data. This presents a challenge for feedforward networks: lower layers in convolutional nets capture rich spatial information , while upper layers encode object-level knowledge but are invariant to factors such as pose and appearance. In this work we propose to augment(More)
—In this paper we describe the Microsoft COCO Caption dataset and evaluation server. When completed, the dataset will contain over one and a half million captions describing over 330,000 images. For the training and validation images, five independent human generated captions will be provided. To ensure consistency in evaluation of automatic caption(More)
The recent availability of geo-tagged images and rich geospatial data has inspired a number of algorithms for image based geolocalization. Most approaches predict the location of a query image by matching to ground-level images with known locations (e.g., street-view data). However, most of the Earth does not have ground-level reference photos available.(More)
The recent availability of large amounts of geotagged imagery has inspired a number of data driven solutions to the image geolocalization problem. Existing approaches predict the location of a query image by matching it to a database of georeferenced photographs. While there are many geotagged images available on photo sharing and street view sites, most(More)
The recent MS COCO object detection dataset presents several new challenges for object detection. In particular, it contains objects at a broad range of scales, less proto-typical images, and requires more precise localization. To address these challenges, we test three modifications to the standard Fast R-CNN object detector: (1) skip connections that give(More)
Feature pyramids are a basic component in recognition systems for detecting objects at different scales. But recent deep learning object detectors have avoided pyramid representations , in part because they are compute and memory intensive. In this paper, we exploit the inherent multi-scale, pyramidal hierarchy of deep convolutional networks to construct(More)
Metric learning algorithms produce distance metrics that capture the important relationships among data. In this work, we study the connection between metric learning and collaborative filtering. We propose Collaborative Metric Learning (CML) which learns a joint metric space to encode not only users' preferences but also the user-user and item-item(More)
Neurogenesis via the activation of endogenous neural progenitor cells is a potential treatment strategy for brain injury, including intracerebral hemorrhage (ICH). We assessed the efficacy of combined cell and brain-derived neurotrophic factor (BDNF) treatment in a mouse model of ICH induced by intracerebral collagenase injection. Complementary DNAs of(More)
We introduce a novel approach to cross camera people counting that can adapt itself to a new environment without the need of manual inspection. The proposed counting model is composed of a pair of collaborative Gaussian processes (GP), which are respectively designed to count people by taking the visible and occluded parts into account. While the first GP(More)