• Publications
  • Influence
Microsoft COCO: Common Objects in Context
We present a new dataset with the goal of advancing the state-of-the-art in object recognition by placing the question of object recognition in the context of the broader question of scene
Feature Pyramid Networks for Object Detection
TLDR
This paper exploits the inherent multi-scale, pyramidal hierarchy of deep convolutional networks to construct feature pyramids with marginal extra cost and achieves state-of-the-art single-model results on the COCO detection benchmark without bells and whistles.
The Caltech-UCSD Birds-200-2011 Dataset
CUB-200-2011 is an extended version of CUB-200 [7], a challenging dataset of 200 bird species. The extended version roughly doubles the number of images per category and adds new part localization
Shape matching and object recognition using shape contexts
TLDR
This paper presents work on computing shape models that are computationally fast and invariant basic transformations like translation, scaling and rotation, and proposes shape detection using a feature called shape context, which is descriptive of the shape of the object.
Behavior recognition via sparse spatio-temporal features
TLDR
It is shown that the direct 3D counterparts to commonly used 2D interest point detectors are inadequate, and an alternative is proposed, and a recognition algorithm based on spatio-temporally windowed data is devised.
Arbitrary Style Transfer in Real-Time with Adaptive Instance Normalization
TLDR
This paper presents a simple yet effective approach that for the first time enables arbitrary style transfer in real-time, comparable to the fastest existing approach, without the restriction to a pre-defined set of styles.
Multimodal Unsupervised Image-to-Image Translation
TLDR
A Multimodal Unsupervised Image-to-image Translation (MUNIT) framework that assumes that the image representation can be decomposed into a content code that is domain-invariant, and a style code that captures domain-specific properties.
Fast Feature Pyramids for Object Detection
TLDR
For a broad family of features, this work finds that features computed at octave-spaced scale intervals are sufficient to approximate features on a finely-sampled pyramid, and this approximation yields considerable speedups with negligible loss in detection accuracy.
Robust Object Tracking with Online Multiple Instance Learning
TLDR
It is shown that using Multiple Instance Learning (MIL) instead of traditional supervised learning avoids these problems and can therefore lead to a more robust tracker with fewer parameter tweaks.
Visual tracking with online Multiple Instance Learning
TLDR
It is shown that using Multiple Instance Learning (MIL) instead of traditional supervised learning avoids these problems, and can therefore lead to a more robust tracker with fewer parameter tweaks.
...
1
2
3
4
5
...