• Publications
  • Influence
Microsoft COCO: Common Objects in Context
TLDR
We present a new dataset with the goal of advancing the state of the art in object recognition by placing the question of object recognition in the context of the broader question of scene understanding.
  • 12,142
  • 2,415
Feature Pyramid Networks for Object Detection
TLDR
In this paper, we exploit the inherent multi-scale, pyramidal hierarchy of deep convolutional networks to construct feature pyramids with marginal extra cost.
  • 5,276
  • 1,168
The Caltech-UCSD Birds-200-2011 Dataset
TLDR
Birds-200-2011 is an extended version of CUB-200 [7], a challenging dataset of 200 bird species.
  • 1,879
  • 709
Shape matching and object recognition using shape contexts
TLDR
This paper presents my work on computing shape models that are computationally fast and invariant to basic transformations such as translation, scaling, and rotation.
  • 5,645
  • 512
Behavior recognition via sparse spatio-temporal features
TLDR
We show that the direct 3D counterparts to commonly used 2D interest point detectors are inadequate for detection of spatio-temporal feature points and propose an alternative.
  • 2,630
  • 370
Fast Feature Pyramids for Object Detection
TLDR
A fast feature pyramid approximation of multi-resolution image features at every scale of a finely sampled image pyramid, with negligible loss in detection accuracy.
  • 1,535
  • 280
Robust Object Tracking with Online Multiple Instance Learning
TLDR
We address the problem of tracking an object in a video given its location in the first frame and no other information.
  • 1,887
  • 273
Visual tracking with online Multiple Instance Learning
TLDR
We address the problem of learning an adaptive appearance model for object tracking that achieves superior results at real-time speed and with fewer parameter tweaks.
  • 1,444
  • 270
Arbitrary Style Transfer in Real-Time with Adaptive Instance Normalization
TLDR
In this paper, we present a simple yet effective approach that for the first time enables arbitrary style transfer in real-time.
  • 1,046
  • 221
Multimodal Unsupervised Image-to-Image Translation
TLDR
We propose a principled framework for the Multimodal UNsupervised Image-to-image Translation (MUNIT) problem, which achieves quality and diversity superior to a state-of-the-art supervised approach.
  • 846
  • 219