Feature Pyramid Networks for Object Detection
This paper exploits the inherent multi-scale, pyramidal hierarchy of deep convolutional networks to construct feature pyramids with marginal extra cost and achieves state-of-the-art single-model results on the COCO detection benchmark without bells and whistles.
Hypercolumns for object segmentation and fine-grained localization
This work defines the hypercolumn at a pixel as the vector of activations of all CNN units above that pixel and, using hypercolumns as pixel descriptors, shows results on three fine-grained localization tasks: simultaneous detection and segmentation, keypoint localization, and part labeling.
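The hypercolumn definition above can be illustrated with a minimal NumPy sketch. It assumes the CNN activations are already available as a list of arrays of shape (channels, height, width) at different resolutions, and it uses nearest-neighbour upsampling purely for simplicity; the function names `upsample_nearest` and `hypercolumns` are hypothetical and not taken from the paper.

```python
import numpy as np

def upsample_nearest(fmap, out_h, out_w):
    """Resize a (C, h, w) feature map to (C, out_h, out_w) by nearest-neighbour lookup."""
    _, h, w = fmap.shape
    rows = np.arange(out_h) * h // out_h  # source row for each output row
    cols = np.arange(out_w) * w // out_w  # source column for each output column
    return fmap[:, rows[:, None], cols[None, :]]

def hypercolumns(feature_maps, out_h, out_w):
    """Stack activations from several layers so that each pixel gets one long vector.

    feature_maps: list of arrays shaped (C_i, h_i, w_i), one per layer (assumed input).
    Returns an array shaped (sum of C_i, out_h, out_w).
    """
    upsampled = [upsample_nearest(f, out_h, out_w) for f in feature_maps]
    return np.concatenate(upsampled, axis=0)

# Stand-in activations from three layers of decreasing resolution.
rng = np.random.default_rng(0)
maps = [rng.normal(size=(2, 8, 8)),
        rng.normal(size=(3, 4, 4)),
        rng.normal(size=(4, 2, 2))]
hc = hypercolumns(maps, 8, 8)
descriptor = hc[:, 5, 3]  # hypercolumn (pixel descriptor) at row 5, column 3
```

Here `descriptor` has one entry per channel across all three layers, so a per-pixel classifier can draw on both coarse and fine activations at once.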
Semantic contours from inverse detectors
A simple yet effective method for combining generic object detectors with bottom-up contours to identify object contours is presented and a principled way of combining information from different part detectors and across categories is provided.
Simultaneous Detection and Segmentation
This work builds on recent work that uses convolutional neural networks to classify category-independent region proposals (R-CNN), introducing a novel architecture tailored for SDS, and uses category-specific, top-down figure-ground predictions to refine the bottom-up proposals.
Low-Shot Visual Recognition by Shrinking and Hallucinating Features
This work presents a low-shot learning benchmark on complex images that mimics challenges faced by recognition systems in the wild, and proposes representation regularization techniques and techniques to hallucinate additional training examples for data-starved classes.
Discriminative Decorrelation for Clustering and Classification
Object detection has over the past few years converged on using linear SVMs over HOG features. Training linear SVMs, however, is quite expensive, and can become intractable as the number of categories...
Pseudo-LiDAR From Visual Depth Estimation: Bridging the Gap in 3D Object Detection for Autonomous Driving
This paper proposes to convert image-based depth maps to pseudo-LiDAR representations, essentially mimicking the LiDAR signal, and achieves impressive improvements over the existing state-of-the-art in image-based performance.
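One way to picture the depth-map conversion described above is standard pinhole back-projection of every pixel into 3D. The sketch below is an illustration under stated assumptions rather than the paper's own code: the function name `depth_to_point_cloud` and the intrinsics `fx`, `fy`, `cx`, `cy` (focal lengths and principal point, in pixels) are hypothetical inputs, and the depth map is assumed to hold metric depth along the camera's z axis.

```python
import numpy as np

def depth_to_point_cloud(depth, fx, fy, cx, cy):
    """Back-project an (H, W) depth map into an (N, 3) point cloud in camera coordinates.

    Assumes a pinhole camera; fx, fy, cx, cy are illustrative intrinsics.
    Pixels with non-positive depth are treated as invalid and dropped.
    """
    h, w = depth.shape
    # v indexes image rows, u indexes image columns.
    v, u = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    points = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    return points[points[:, 2] > 0]

# Tiny example: a 2x3 depth map, 2 m everywhere except one invalid pixel.
depth = np.full((2, 3), 2.0)
depth[0, 0] = 0.0
pts = depth_to_point_cloud(depth, fx=1.0, fy=1.0, cx=1.0, cy=0.5)
```

The resulting array can then be passed to a detector that expects LiDAR-style point input, which is the sense in which the representation mimics a LiDAR signal.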
Low-Shot Learning from Imaginary Data
This work builds on recent progress in meta-learning by combining a meta-learner with a "hallucinator" that produces additional training examples, and optimizing both models jointly, yielding state-of-the-art performance on the challenging ImageNet low-shot classification benchmark.
PointFlow: 3D Point Cloud Generation With Continuous Normalizing Flows
This work presents a principled probabilistic framework that generates 3D point clouds by modeling them as a distribution of distributions. The invertibility of normalizing flows enables computation of the likelihood during training and allows the model to be trained in the variational inference framework.
Learning Features by Watching Objects Move
Inspired by the human visual system, this work shows that low-level motion-based grouping cues can be used to learn an effective visual representation that significantly outperforms previous unsupervised approaches across multiple settings, especially when training data for the target task is scarce.