Corpus ID: 246431014

Learning Super-Features for Image Retrieval

@article{Weinzaepfel2022LearningSF,
  title={Learning Super-Features for Image Retrieval},
  author={Philippe Weinzaepfel and Thomas Lucas and Diane Larlus and Yannis Kalantidis},
  journal={ArXiv},
  year={2022},
  volume={abs/2201.13182}
}
Methods that combine local and global features have recently shown excellent performance on multiple challenging deep image retrieval benchmarks, but their use of local features raises at least two issues. First, these local features simply boil down to the localized map activations of a neural network, and hence can be extremely redundant. Second, they are typically trained with a global loss that only acts on top of an aggregation of local features; by contrast, testing is based on local… 
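The abstract's first observation — that "local features" are simply the spatial activations of a CNN feature map, which a global loss then only sees through an aggregation step — can be illustrated with a minimal numpy sketch. This is illustrative only, not the paper's method; all function names here are hypothetical, and mean pooling stands in for whatever aggregation the global loss acts on.

```python
import numpy as np

def local_features(feature_map):
    """Treat each spatial position of a C x H x W CNN feature map as one
    local descriptor -- the 'localized map activation' view of local
    features criticized in the abstract (neighboring positions are
    typically highly redundant)."""
    c, h, w = feature_map.shape
    return feature_map.reshape(c, h * w).T  # (H*W, C): one row per location

def global_descriptor(feature_map):
    """Aggregate the local features into a single global descriptor via
    mean pooling + L2 normalization (a simple stand-in for the global
    aggregation that the training loss operates on)."""
    locs = local_features(feature_map)
    g = locs.mean(axis=0)
    return g / np.linalg.norm(g)

rng = np.random.default_rng(0)
fmap = rng.standard_normal((256, 7, 7))  # hypothetical backbone output
locs = local_features(fmap)              # 49 (redundant) local descriptors
g = global_descriptor(fmap)              # one 256-d global descriptor
print(locs.shape, g.shape)               # (49, 256) (256,)
```

The mismatch the abstract points at is visible here: training supervises only `g`, while retrieval-time matching would use the rows of `locs`.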
Danish Airs and Grounds: A Dataset for Aerial-to-Street-Level Place Recognition and Localization
TLDR
This paper contributes the Danish Airs and Grounds (DAG) dataset, a large collection of street-level and aerial images targeting place recognition and visual localization in wide-baseline configurations, and proposes a map-to-image re-localization pipeline.

References

SHOWING 1-10 OF 53 REFERENCES
Unifying Deep Local and Global Features for Image Search
TLDR
This work unifies global and local features into a single deep model, enabling accurate retrieval with efficient feature extraction, and introduces an autoencoder-based dimensionality reduction technique for local features, which is integrated into the model, improving training efficiency and matching performance.
Aggregating Deep Convolutional Features for Image Retrieval
TLDR
This paper investigates possible ways to aggregate local deep features to produce compact global descriptors for image retrieval, and shows that deep features and traditional hand-engineered features have quite different distributions of pairwise similarities; hence, existing aggregation methods have to be carefully re-evaluated.
Large-Scale Image Retrieval with Attentive Deep Local Features
TLDR
An attentive local feature descriptor suitable for large-scale image retrieval, referred to as DELF (DEep Local Feature), based on convolutional neural networks, which is trained only with image-level annotations on a landmark image dataset.
Deep Image Retrieval: Learning Global Representations for Image Search
TLDR
This work proposes a novel approach for instance-level image retrieval that produces a global, compact, fixed-length representation for each image by aggregating many region-wise descriptors, leveraging a ranking framework and projection weights to build the region features.
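The region-wise aggregation summarized above can be sketched roughly in numpy: max-pool the feature map over overlapping square regions, L2-normalize each region descriptor, sum them, and normalize again. This is a simplified single-scale sketch in the spirit of such methods, not the paper's exact pipeline (which also uses multiple region scales and whitening); the function name and parameters are hypothetical.

```python
import numpy as np

def region_aggregate(feature_map, region=4, stride=2):
    """Simplified region-wise aggregation: max-pool a C x H x W feature
    map over overlapping square regions, L2-normalize each region
    descriptor, sum, and L2-normalize the result into one global
    fixed-length descriptor."""
    c, h, w = feature_map.shape
    agg = np.zeros(c)
    for y in range(0, h - region + 1, stride):
        for x in range(0, w - region + 1, stride):
            r = feature_map[:, y:y + region, x:x + region].max(axis=(1, 2))
            n = np.linalg.norm(r)
            if n > 0:
                agg += r / n          # each region contributes a unit vector
    return agg / np.linalg.norm(agg)  # compact global descriptor

rng = np.random.default_rng(1)
g = region_aggregate(rng.standard_normal((256, 7, 7)))
print(g.shape)  # (256,)
```

The output length depends only on the channel dimension, which is what makes the representation fixed-length regardless of image size or region count.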
Fine-Tuning CNN Image Retrieval with No Human Annotation
TLDR
It is shown that both hard-positive and hard-negative examples, selected by exploiting the geometry and the camera positions available from the 3D models, enhance the performance of particular-object retrieval.
Particular object retrieval with integral max-pooling of CNN activations
TLDR
This work revisits both retrieval stages, namely initial search and re-ranking, by employing the same primitive information derived from the CNN, and significantly improves the existing CNN-based recognition pipeline.
SOLAR: Second-Order Loss and Attention for Image Retrieval
TLDR
The results show that the two second-order components complement each other, bringing significant performance improvements in both tasks and leading to state-of-the-art results across the public benchmarks.
Learning and aggregating deep local descriptors for instance-level recognition
TLDR
The experimental validation studies the trade-off between performance and memory requirements of state-of-the-art image search approaches based on match kernels, and demonstrates why the proposed approach learns local descriptors that work well for image similarity estimation with classical efficient match kernels.
Learning With Average Precision: Training Image Retrieval With a Listwise Loss
TLDR
This paper proposes to directly optimize the global mAP by leveraging recent advances in listwise loss formulations; using a histogram binning approximation, the AP can be differentiated and thus employed for end-to-end learning.
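The histogram-binning idea in this summary can be sketched concretely: soft-assign each similarity score to a small set of bins with a triangular kernel, then compute precision cumulatively over bins from highest to lowest similarity. Because every step is a sum or ratio of soft assignments, the same computation becomes differentiable with respect to the scores in an autodiff framework. This is a sketch of the general histogram-AP formulation, not the paper's exact implementation; numpy is used only to show the arithmetic.

```python
import numpy as np

def histogram_ap(scores, labels, n_bins=11):
    """Histogram-binned approximation of Average Precision.
    scores: similarities in [-1, 1]; labels: 1 for relevant, 0 otherwise."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=float)
    centers = np.linspace(1.0, -1.0, n_bins)   # bins from best to worst score
    delta = 2.0 / (n_bins - 1)
    # soft assignment of each score to each bin (triangular kernel)
    w = np.clip(1.0 - np.abs(scores[None, :] - centers[:, None]) / delta,
                0.0, None)
    count = w.sum(axis=1)                      # (soft) items per bin
    rel = (w * labels[None, :]).sum(axis=1)    # (soft) relevant items per bin
    cum_count = np.cumsum(count)               # items ranked at or above bin
    cum_rel = np.cumsum(rel)
    prec = np.where(cum_count > 0,
                    cum_rel / np.maximum(cum_count, 1e-12), 0.0)
    # AP = sum over bins of precision weighted by recall increment
    return float((prec * rel).sum() / labels.sum())

# Scores placed exactly at bin centers reproduce the exact AP:
ap = histogram_ap([1.0, 0.6, 0.2], [1, 0, 1])
print(round(ap, 4))  # 0.8333, i.e. (1/1 + 2/3) / 2
```

With scores between bin centers the triangular kernel splits mass between neighboring bins, which is exactly what smooths the otherwise non-differentiable ranking operation.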
Supervised mid-level features for word image representation
  • Albert Gordo
  • Computer Science
    2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
  • 2015
TLDR
This paper proposes to learn local mid-level features suitable for building word image representations by leveraging character bounding box annotations on a small set of training images, and achieves results comparable with or better than the state-of-the-art on matching and recognition tasks using global descriptors of only 96 dimensions.