
The 2021 Image Similarity Dataset and Challenge

M. Douze, Giorgos Tolias, Ed Pizzi, Zoe Papakipos, L. Chanussot, Filip Radenović, T. Jenícek, Maxim Maximov, L. Leal-Taixé, Ismail Elezi, O. Chum, C. Canton-Ferrer
This paper introduces a new benchmark for large-scale image similarity detection, used for the Image Similarity Challenge at NeurIPS’21 (ISC2021). The goal is to determine whether a query image is a modified copy of any image in a reference corpus of size 1 million. The benchmark features a variety of image transformations, such as automated transformations, hand-crafted image edits, and machine-learning-based manipulations, mimicking real-life cases appearing in social media…
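The detection task described in the abstract reduces to a nearest-neighbor search over image descriptors: embed the query, compare it against every reference embedding, and flag a match if the best similarity clears a threshold. A minimal sketch with random descriptors standing in for a real embedding model (the function name, descriptor dimension, and threshold are illustrative, not the challenge's actual baseline):

```python
import numpy as np

def detect_copy(query_desc, ref_descs, threshold=0.8):
    """Flag the best-matching reference if its cosine similarity to the
    query exceeds the threshold (descriptors assumed L2-normalized)."""
    sims = ref_descs @ query_desc          # cosine similarities, shape (n_refs,)
    best = int(np.argmax(sims))
    if sims[best] >= threshold:
        return best, float(sims[best])
    return None                            # query is not a copy of any reference

rng = np.random.default_rng(0)
refs = rng.normal(size=(1000, 128)).astype(np.float32)
refs /= np.linalg.norm(refs, axis=1, keepdims=True)

# Simulate a lightly edited copy of reference image 42.
query = refs[42] + 0.02 * rng.normal(size=128).astype(np.float32)
query /= np.linalg.norm(query)

print(detect_copy(query, refs))
```

At the challenge's scale (1 million references), the brute-force matrix product above would be replaced by an approximate nearest-neighbor index, but the decision rule is the same.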



Using extreme value theory for image detection
The primary target of content-based image retrieval is to return a list of images that are most similar to a query image. This is usually done by ordering the images based on a similarity score.
Lost in quantization: Improving particular object retrieval in large scale image databases
The state of the art in visual object retrieval from large databases is achieved by systems that are inspired by text retrieval. A key component of these approaches is that local regions of images…
Revisiting Oxford and Paris: Large-Scale Image Retrieval Benchmarking
Issues with image retrieval benchmarking on the standard and popular Oxford 5k and Paris 6k datasets are addressed: in particular, annotation errors, the size of the datasets, and the level of challenge.
Object retrieval with large vocabularies and fast spatial matching
To improve query performance, this work adds an efficient spatial verification stage to re-rank the results returned from the bag-of-words model and shows that this consistently improves search quality, though by less of a margin when the visual vocabulary is large.
Total Recall: Automatic Query Expansion with a Generative Feature Model for Object Retrieval
This paper brings query expansion into the visual domain via two novel contributions: strong spatial constraints between the query image and each result allow us to accurately verify each return, suppressing the false positives which typically ruin text-based query expansion.
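The core of average query expansion, as described above, is simple: keep only the top results that pass spatial verification, average their descriptors with the original query descriptor, and re-issue the expanded query. A hedged sketch under the assumption of L2-normalized global descriptors (the verification step is represented by a boolean mask rather than an actual geometric check):

```python
import numpy as np

def average_query_expansion(query_desc, result_descs, verified):
    """Average the original query descriptor with the descriptors of
    spatially verified results, then re-normalize to unit length."""
    pool = np.vstack([query_desc[None, :], result_descs[verified]])
    expanded = pool.mean(axis=0)
    return expanded / np.linalg.norm(expanded)

rng = np.random.default_rng(0)
center = rng.normal(size=64)
center /= np.linalg.norm(center)

# Top-5 results: three verified true matches near the query, two false positives.
results = np.vstack([center + 0.05 * rng.normal(size=64) for _ in range(3)]
                    + [rng.normal(size=64) for _ in range(2)])
results /= np.linalg.norm(results, axis=1, keepdims=True)
verified = np.array([True, True, True, False, False])

expanded = average_query_expansion(center, results, verified)
```

The verification mask is what suppresses the false positives mentioned in the snippet: without it, the two unrelated descriptors would drag the expanded query away from the object of interest.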
Large-Scale Image Retrieval with Attentive Deep Local Features
An attentive local feature descriptor suitable for large-scale image retrieval, referred to as DELF (DEep Local Feature), based on convolutional neural networks, which are trained only with image-level annotations on a landmark image dataset.
Hamming Embedding and Weak Geometric Consistency for Large Scale Image Search
Estimating the full geometric transformation of bag-of-features in the framework of approximate nearest neighbor search is complementary to the weak geometric consistency constraints and further improves accuracy.
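Hamming embedding refines a coarse quantizer by attaching a short binary signature to each descriptor, so that two descriptors falling in the same cell can still be distinguished by their Hamming distance. A minimal sketch, where signatures come from random projections followed by sign binarization (an illustrative stand-in for the paper's learned projection and per-cell thresholds):

```python
import numpy as np

def binary_signature(desc, proj):
    """Project the descriptor and keep only the signs: bit i is 1
    when the i-th projection is positive."""
    return (proj @ desc > 0).astype(np.uint8)

def hamming_distance(a, b):
    """Number of differing bits between two binary signatures."""
    return int(np.count_nonzero(a != b))

rng = np.random.default_rng(1)
proj = rng.normal(size=(64, 128))          # 64-bit signatures from 128-D descriptors
x = rng.normal(size=128)
x_noisy = x + 0.1 * rng.normal(size=128)   # slightly transformed copy
y = rng.normal(size=128)                   # unrelated descriptor

d_copy = hamming_distance(binary_signature(x, proj), binary_signature(x_noisy, proj))
d_other = hamming_distance(binary_signature(x, proj), binary_signature(y, proj))
```

A near-duplicate flips only a few bits, while an unrelated descriptor disagrees on roughly half of them, which is what makes the signature an effective cheap filter inside each quantization cell.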
Evaluation of GIST descriptors for web-scale image search
This paper evaluates the search accuracy and complexity of the global GIST descriptor for two applications, for which a local description is usually preferred: same location/object recognition and copy detection, and proposes an indexing strategy for global descriptors that optimizes the trade-off between memory usage and precision.
VCDB: A Large-Scale Database for Partial Copy Detection in Videos
A large-scale video copy database (VCDB) with over 100,000 Web videos, containing more than 9,000 copied segment pairs found through careful manual annotation, is introduced.
Large-scale image retrieval with compressed Fisher vectors
This article shows why the Fisher representation is well-suited to the retrieval problem: it describes an image by what makes it different from other images. It also shows how Fisher vectors can be compressed to reduce their memory footprint and speed up retrieval.