Location Sensitive Image Retrieval and Tagging

Raul Gomez, Jaume Gibert, Lluís Gómez, Dimosthenis Karatzas
People from different parts of the globe describe objects and concepts in distinct ways. Visual appearance can thus vary across geographic locations, which makes location relevant contextual information when analysing visual data. In this work, we address the task of retrieving images related to a given tag conditioned on a certain location on Earth. We present LocSens, a model that learns to rank triplets of images, tags and coordinates by plausibility, and two training strategies…
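The triplet-ranking idea in the abstract can be sketched as follows. Everything here is an illustrative placeholder, not LocSens's actual architecture: the elementwise fusion, the linear location projection `W_loc`, and the margin hinge loss are simple stand-ins for whatever scoring model the paper trains.

```python
import numpy as np

rng = np.random.default_rng(0)

D = 16  # shared embedding size (illustrative)

def score(img_emb, tag_emb, latlon, W_loc):
    """Plausibility score of an (image, tag, location) triplet: project the
    raw coordinates into the embedding space, fuse elementwise, and sum."""
    loc_emb = W_loc @ latlon             # (D, 2) @ (2,) -> (D,)
    joint = img_emb * tag_emb * loc_emb  # one simple fusion choice
    return joint.sum()

def ranking_loss(pos_score, neg_score, margin=1.0):
    """Margin ranking loss: a plausible triplet should outscore a corrupted
    one (e.g. same image and tag, wrong location) by at least `margin`."""
    return max(0.0, margin - pos_score + neg_score)

W_loc = rng.normal(size=(D, 2))
img, tag = rng.normal(size=D), rng.normal(size=D)
paris, sydney = np.array([48.86, 2.35]), np.array([-33.87, 151.21])

pos = score(img, tag, paris, W_loc)   # assumed-correct location
neg = score(img, tag, sydney, W_loc)  # corrupted location
loss = ranking_loss(pos, neg)
print(loss >= 0.0)  # the hinge loss is never negative
```

In a real system the image and tag embeddings would come from learned encoders, and the loss would be minimized over many such positive/negative triplet pairs.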
1 Citation

Geography-Aware Self-Supervised Learning

This paper proposes novel training methods that exploit the spatio-temporal structure of remote sensing data: it leverages spatially aligned images over time to construct temporal positive pairs for contrastive learning, and uses geo-location to design pre-text tasks.
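The "temporal positive pairs" construction can be illustrated with toy data: spatially aligned tiles of the same location captured at different times are paired up as positives for a contrastive objective. The record format below is assumed for illustration only.

```python
from collections import defaultdict

# Toy records of remote sensing tiles: (tile_id, location_id, year).
tiles = [
    ("a", "loc1", 2016), ("b", "loc1", 2018),
    ("c", "loc2", 2017), ("d", "loc2", 2019), ("e", "loc3", 2020),
]

def temporal_positive_pairs(tiles):
    """Pair tiles of the same location taken at different times."""
    by_loc = defaultdict(list)
    for tile_id, loc, year in tiles:
        by_loc[loc].append((tile_id, year))
    pairs = []
    for items in by_loc.values():
        for i in range(len(items)):
            for j in range(i + 1, len(items)):
                if items[i][1] != items[j][1]:  # different acquisition times
                    pairs.append((items[i][0], items[j][0]))
    return pairs

print(temporal_positive_pairs(tiles))  # [('a', 'b'), ('c', 'd')]
```

Tile "e" has no temporal partner and yields no pair; a contrastive loss would then pull each pair's embeddings together while pushing apart tiles from different locations.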



Geo-location inference from image content and user tags

This paper uses a large collection of over a million geotagged photographs to build location probability maps of user tags over the entire globe, which reflect the picture-taking and tagging behaviors of thousands of users from all over the world, and reveal interesting tag map patterns.
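A location probability map for a tag is, at its simplest, a normalized 2-D histogram of where that tag was used. The sketch below shows that basic construction on toy data; the grid resolution and photo records are illustrative, not the paper's.

```python
import numpy as np

# Toy geotagged photos: (tag, lat, lon).
photos = [
    ("beach", -33.9, 151.2), ("beach", -33.8, 151.3),
    ("beach", 48.9, 2.3), ("museum", 48.86, 2.35),
]

def tag_probability_map(photos, tag, lat_bins=4, lon_bins=8):
    """Normalized 2-D histogram of a tag's usage over a lat/lon grid."""
    pts = [(lat, lon) for t, lat, lon in photos if t == tag]
    lats, lons = zip(*pts)
    hist, _, _ = np.histogram2d(
        lats, lons, bins=[lat_bins, lon_bins],
        range=[[-90, 90], [-180, 180]])
    return hist / hist.sum()  # probability of the tag per grid cell

beach_map = tag_probability_map(photos, "beach")
print(beach_map.sum())  # 1.0 -- a proper distribution over cells
```

With a million photos rather than four, such maps start to reflect real picture-taking and tagging behaviour, which is what the paper exploits.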

Combination of content analysis and context features for digital photograph retrieval.

In this retrieval scenario, where a user searches for photos of a known building or monument in a large shared collection, content-based techniques can offer a significant improvement over ranking based on context (specifically location) alone.

Presence-Only Geographical Priors for Fine-Grained Image Classification

An efficient spatio-temporal prior is proposed that, when conditioned on a geographical location and time, estimates the probability that a given object category occurs at that location.
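The way such a prior is typically used can be shown in a few lines: multiply the image classifier's output by the location-conditioned category probabilities and renormalize. The categories and numbers below are made up for illustration; this is the general combination idea, not the paper's specific model.

```python
import numpy as np

categories = ["polar_bear", "kangaroo", "pigeon"]

image_probs = np.array([0.45, 0.45, 0.10])     # classifier output (toy)
prior_arctic = np.array([0.80, 0.001, 0.199])  # p(category | arctic location), toy

# Down-weight categories that are implausible at the photo's location.
posterior = image_probs * prior_arctic
posterior /= posterior.sum()

print(categories[int(np.argmax(posterior))])  # 'polar_bear'
```

The image alone cannot distinguish the first two categories (both 0.45), but the geographical prior resolves the tie decisively.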

Beyond Instance-Level Image Retrieval: Leveraging Captions to Learn a Global Visual Representation for Semantic Retrieval

This work shows that, despite its subjective nature, the task of semantically ranking visual scenes is consistently implemented across a pool of human annotators and forms a good computable surrogate for semantic image retrieval in complex scenes.

Tag Completion for Image Retrieval

This work proposes a new algorithm for tag completion, where the goal is to automatically fill in missing tags and correct noisy tags for given images; it represents the image-tag relation by a tag matrix and searches for the optimal tag matrix consistent with both the observed tags and the visual similarity.
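The tag-matrix idea can be sketched with one propagation step: tags flow from visually similar images into rows with missing entries. This is a minimal stand-in for the paper's actual optimization, using a made-up similarity matrix and a single smoothing pass.

```python
import numpy as np

# Binary tag matrix T (images x tags) with image 1's tags missing.
T_observed = np.array([
    [1, 0, 1],   # image 0: tags {sunset, beach}
    [0, 0, 0],   # image 1: tags missing
    [1, 1, 0],   # image 2: tags {sunset, palm}
], dtype=float)

S = np.array([   # visual similarity between images (row-normalized, toy)
    [0.0, 0.5, 0.5],
    [0.5, 0.0, 0.5],
    [0.5, 0.5, 0.0],
])

# One propagation step: keep observed tags, inherit weighted neighbour tags.
T_completed = np.maximum(T_observed, S @ T_observed)
print(T_completed[1])  # image 1 inherits tags from its visual neighbours
```

Image 1 ends up with full confidence in "sunset" (both neighbours have it) and partial confidence in the other two tags, matching the intuition that the completed matrix should agree with both observations and visual similarity.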

Personalized Geo-Specific Tag Recommendation for Photos on Social Websites

This work focuses on the personalized tag recommendation task and tries to identify user-preferred, geo-location-specific as well as semantically relevant tags for a photo by leveraging rich contexts of the freely available community-contributed photos.

Geotagging in multimedia and computer vision—a survey

This paper surveys geo-tagging related research within the context of multimedia and along three dimensions: modalities in which geographical information can be extracted, applications that can benefit from the use of geographical information, and the interplay between modalities and applications.

Joint Image-Text Representation by Gaussian Visual-Semantic Embedding

A novel Gaussian Visual-Semantic Embedding model is presented, which leverages visual information to model text concepts as Gaussian distributions in semantic space, achieving higher accuracy and better robustness.
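Representing a concept as a Gaussian rather than a point can be sketched as follows: each concept gets a mean and a (diagonal) variance in the shared space, and an image embedding is scored by the concept's log-density. The numbers and the diagonal-covariance choice are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def log_density(x, mean, var):
    """Log-density of x under a diagonal Gaussian (up to no approximation)."""
    return -0.5 * np.sum((x - mean) ** 2 / var + np.log(2 * np.pi * var))

concept_mean = np.zeros(4)
concept_var = np.ones(4) * 0.5  # tighter variance = more specific concept

near = np.full(4, 0.1)  # image embedding close to the concept
far = np.full(4, 2.0)   # image embedding far from the concept

print(log_density(near, concept_mean, concept_var) >
      log_density(far, concept_mean, concept_var))  # True
```

The variance gives such a model something a point embedding lacks: broad concepts ("animal") can claim a wide region of the space while specific ones ("dalmatian") stay tight.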

PlaNet - Photo Geolocation with Convolutional Neural Networks

This work subdivides the surface of the earth into thousands of multi-scale geographic cells and trains a deep network using millions of geotagged images, showing that the resulting model, called PlaNet, outperforms previous approaches and even attains superhuman accuracy in some cases.
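The classification framing behind PlaNet can be sketched with a toy partition: map coordinates to a cell index for training, and map a predicted index back to the cell's centre at inference. A uniform 10-degree lat/lon grid stands in here for the paper's adaptive multi-scale partition.

```python
LAT_CELLS, LON_CELLS = 18, 36  # 10-degree cells, for illustration

def cell_of(lat, lon):
    """Map coordinates to a flat cell index (the classification target)."""
    i = min(int((lat + 90) / 180 * LAT_CELLS), LAT_CELLS - 1)
    j = min(int((lon + 180) / 360 * LON_CELLS), LON_CELLS - 1)
    return i * LON_CELLS + j

def cell_center(idx):
    """Map a predicted cell index back to its centre coordinates."""
    i, j = divmod(idx, LON_CELLS)
    return ((i + 0.5) * 180 / LAT_CELLS - 90,
            (j + 0.5) * 360 / LON_CELLS - 180)

idx = cell_of(48.86, 2.35)  # Paris
print(cell_center(idx))     # centre of the 10-degree cell containing Paris
```

Treating geolocation as classification over cells lets the network output a full distribution over the globe, which handles ambiguous scenes (a generic beach) more gracefully than regressing a single coordinate pair.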

Spirittagger: a geo-aware tag suggestion tool mined from flickr

Experiments on a data set consisting of over 100,000 Flickr photos in Los Angeles and Southern California show that the geographically relevant tag suggestion tool provides a significant improvement in precision-recall performance over baseline image-based similarity suggestion.