• Publications
  • Influence
Very Deep Convolutional Networks for Large-Scale Image Recognition
In this work we investigate the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting. Our main contribution is a thorough evaluation of networks ofExpand
  • 37,559
  • 7660
The Pascal Visual Object Classes (VOC) Challenge
The Pascal Visual Object Classes (VOC) challenge is a benchmark in visual object category recognition and detection, providing the vision and machine learning communities with a standard dataset ofExpand
  • 7,904
  • 1169
Two-Stream Convolutional Networks for Action Recognition in Videos
We investigate architectures of discriminatively trained deep Convolutional Networks (ConvNets) for action recognition in video. The challenge is to capture the complementary information onExpand
  • 3,797
  • 693
Video Google: a text retrieval approach to object matching in videos
We describe an approach to object and scene retrieval which searches for and localizes all the occurrences of a user outlined object in a video. The object is represented by a set of viewpointExpand
  • 6,334
  • 575
Deep Face Recognition
The goal of this paper is face recognition – from either a single photograph or from a set of faces tracked in a video. Recent progress in this area has been due to two factors: (i) end to endExpand
  • 2,867
  • 560
Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset
The paucity of videos in current action classification datasets (UCF-101 and HMDB-51) has made it difficult to identify good video architectures, as most methods obtain similar performance onExpand
  • 1,663
  • 515
Object retrieval with large vocabularies and fast spatial matching
In this paper, we present a large-scale object retrieval system. The user supplies a query object by selecting a region of a query image, and the system returns a ranked list of images that containExpand
  • 2,669
  • 468
Return of the Devil in the Details: Delving Deep into Convolutional Nets
The latest generation of Convolutional Neural Networks (CNN) have achieved impressive results in challenging benchmarks on image recognition and object detection, significantly raising the interestExpand
  • 2,571
  • 418
Spatial Transformer Networks
Convolutional Neural Networks define an exceptionally powerful class of models, but are still limited by the lack of ability to be spatially invariant to the input data in a computationally andExpand
  • 2,856
  • 411
A Comparison of Affine Region Detectors
The paper gives a snapshot of the state of the art in affine covariant region detectors, and compares their performance on a set of test images under varying imaging conditions. Six types ofExpand
  • 3,176
  • 329