Training Region-Based Object Detectors with Online Hard Example Mining
TLDR
We present a simple yet surprisingly effective online hard example mining (OHEM) algorithm that eliminates several heuristics and hyperparameters in common use.
  • 1,243
  • 147
Cross-Stitch Networks for Multi-task Learning
TLDR
We present cross-stitch units, a generalized way of learning shared representations for multi-task learning in ConvNets.
  • 488
  • 74
Revisiting Unreasonable Effectiveness of Data in Deep Learning Era
TLDR
The success of deep learning in vision can be attributed to: (a) models with high capacity; (b) increased computational power; and (c) availability of large-scale labeled data.
  • 735
  • 38
NEIL: Extracting Visual Knowledge from Web Data
TLDR
We propose NEIL (Never Ending Image Learner), a computer program that runs 24 hours per day and 7 days per week to automatically extract visual knowledge from Internet data to develop the world's largest visual structured knowledge base with minimum human labeling effort.
  • 406
  • 29
A-Fast-RCNN: Hard Positive Generation via Adversary for Object Detection
TLDR
We propose to learn an adversarial network that generates examples with occlusions and deformations.
  • 321
  • 25
Beyond Skip Connections: Top-Down Modulation for Object Detection
TLDR
Inspired by the human visual pathway, we propose top-down modulations as a way to incorporate fine details into the detection framework.
  • 230
  • 25
Tracking Emerges by Colorizing Videos
TLDR
We use large amounts of unlabeled video to learn models for visual tracking without manual human supervision, using video colorization as the supervisory signal.
  • 150
  • 17
Data-driven visual similarity for cross-domain image matching
TLDR
We propose a surprisingly simple method that estimates the relative importance of different features in a query image based on the notion of "data-driven uniqueness".
  • 245
  • 16
Enriching Visual Knowledge Bases via Object Discovery and Segmentation
TLDR
We propose a conceptually simple yet powerful approach that combines the power of generative modeling for segmentation with the effectiveness of discriminative models for detection to segment objects.
  • 113
  • 15
Actor-Centric Relation Network
TLDR
We model spatio-temporal relations to capture the interactions between human actors, relevant objects, and scene elements that are essential for differentiating similar human actions.
  • 86
  • 11