NetVLAD: CNN Architecture for Weakly Supervised Place Recognition
TLDR
We tackle the problem of large scale visual place recognition, where the task is to quickly and accurately recognize the location of a given query photograph.
  • 708
  • 176
Three things everyone should know to improve object retrieval
TLDR
The objective of this work is object retrieval in large scale image datasets, where the object is specified by an image query and retrieval should be immediate at run time.
  • 1,168
  • 163
NetVLAD: CNN Architecture for Weakly Supervised Place Recognition
TLDR
We tackle the problem of large scale visual place recognition, where the task is to quickly and accurately recognize the location of a given query photograph.
  • 382
  • 101
All About VLAD
TLDR
The objective of this paper is large scale object instance retrieval, given a query image.
  • 582
  • 84
Convolutional Neural Network Architecture for Geometric Matching
TLDR
We address the problem of determining correspondences between two images in agreement with a geometric model such as an affine or thin-plate spline transformation.
  • 230
  • 53
Look, Listen and Learn
We consider the question: what can be learnt by looking at and listening to a large number of unlabelled videos? There is a valuable, but so far untapped, source of information contained in the video…
  • 288
  • 40
On the Effectiveness of Interval Bound Propagation for Training Verifiably Robust Models
TLDR
We show how a simple bounding technique, interval bound propagation (IBP), can be exploited to train large provably robust neural networks that beat the state-of-the-art in verified accuracy.
  • 163
  • 30
24/7 Place Recognition by View Synthesis
TLDR
We address the problem of large-scale visual place recognition for situations where the scene undergoes a major change in appearance, for example, due to illumination (day/night), change of seasons, aging, or structural modifications over time such as buildings being built or destroyed.
  • 141
  • 27
Objects that Sound
TLDR
In this paper our objectives are, first, networks that can embed audio and visual inputs into a common space that is suitable for cross-modal retrieval; and second, a network that can localize the object that sounds in an image, given the audio signal.
  • 162
  • 23
End-to-End Weakly-Supervised Semantic Alignment
TLDR
We develop a convolutional neural network architecture for semantic alignment that is trainable in an end-to-end manner from weak image-level supervision in the form of matching image pairs.
  • 77
  • 23