Spatial Statistics of Term Co-occurrences for Location Prediction of Tweets

Özer Özdikis, Heri Ramampiaro, Kjetil Nørvåg
Predicting the locations of non-geotagged tweets is an active research area in geographical information retrieval. In this work, we propose a method to detect term co-occurrences in tweets that exhibit spatial clustering or dispersion tendency with significant deviation from the underlying single-term patterns, and use these co-occurrences to extend the feature space in probabilistic language models. We observe that using term pairs that spatially attract or repel each other yields significant… 

Locality-adapted Kernel Densities for Tweet Localization

A location prediction method for tweets based on the geographical probability distribution of their terms over a region using Kernel Density Estimation (KDE), which relies on statistical approaches without requiring any parameter tuning.
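
The KDE idea summarized above can be illustrated with a minimal sketch. It assumes planar coordinates, a Gaussian kernel, and a single fixed bandwidth. The fixed bandwidth is a simplification chosen for illustration and is not the locality-adapted, tuning-free scheme the summary describes. The function names, the density floor, and the toy data are hypothetical.

```python
import math


def gaussian_kde_2d(points, bandwidth):
    """Build a 2-D Gaussian kernel density function from sample points."""
    n = len(points)
    norm = 1.0 / (n * 2.0 * math.pi * bandwidth ** 2)

    def density(x, y):
        total = 0.0
        for px, py in points:
            d2 = (x - px) ** 2 + (y - py) ** 2
            total += math.exp(-d2 / (2.0 * bandwidth ** 2))
        return norm * total

    return density


def predict_location(terms, term_points, candidates, bandwidth=1.0, floor=1e-12):
    """Return the candidate with the highest summed log-density over the terms.

    Terms without any geotagged occurrences are ignored (an assumption);
    `floor` avoids log(0) far away from every sample point.
    """
    densities = {t: gaussian_kde_2d(term_points[t], bandwidth)
                 for t in terms if t in term_points}
    best, best_score = None, -math.inf
    for x, y in candidates:
        score = sum(math.log(max(d(x, y), floor)) for d in densities.values())
        if score > best_score:
            best, best_score = (x, y), score
    return best


# Toy data: geotagged occurrences of each term (hypothetical values).
term_points = {
    "harbour": [(0.0, 0.0), (0.5, 0.2), (0.1, -0.3)],
    "stadium": [(5.0, 5.0), (5.2, 4.8)],
    "coffee": [(0.0, 0.0), (5.0, 5.0), (2.0, 3.0)],
}
candidates = [(0.0, 0.0), (5.0, 5.0)]
print(predict_location(["harbour", "coffee"], term_points, candidates))
```

In this toy setting "coffee" is equally likely at both candidates, so the spatially concentrated term "harbour" decides the prediction.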

Inferring the geolocation of tweets at a fine-grained level

The effectiveness of using new geolocalised incident-related tweets to detect the geolocation of real incident reports is demonstrated, and the overall performance of the traffic incident detection task is shown to improve when the already available geotagged tweets are enhanced with new tweets that were geolocalised at a fine-grained level.

Constructing Geographic Dictionary from Streaming Geotagged Tweets

An online method for continuously updating the geographic dictionary is proposed; it adaptively determines suitable time intervals for examining the spatial locality of each word and filters out geotagged posts from bot accounts, based on the content similarity among their posts, to improve the quality of the extracted local words.

Geotagging tweets to landmarks using convolutional neural networks with text and posting time

A Convolutional Neural Network architecture for geotagging tweets to landmarks is proposed, based on the text in tweets and other meta information such as posting time and source; the algorithm is shown to outperform various state-of-the-art baselines.

A Transformer-based Framework for POI-level Social Post Geolocation

A general transformer-based framework is presented for social post geolocation at the POI level; it builds upon pre-trained language models and considers non-textual data, and three variants of the framework are shown to outperform multiple state-of-the-art baselines in terms of accuracy and distance error metrics.

Geosocial Media as a Proxy for Security: A Review

A clarion call is issued to scientists, practitioners, and other stakeholders to enhance the capabilities and accuracy of security event detection and of security situational awareness and assessment using geosocial media.

A Personalized Search Query Generating Method for Safety-Enhanced Vehicle-to-People Networks

A Personalized Search Query Generator (PSQG) is constructed to reduce driver-mobile interaction during information retrieval in the 6G era; it can improve drivers’ safety if used in smartphones and other in-vehicle information retrieval systems.



Text-Based Twitter User Geolocation Prediction

This paper presents an integrated geolocation prediction framework, evaluates the impact of non-geotagged tweets, language, and user-declared metadata on geolocation prediction, and discusses how users differ in terms of their geolocatability.

A Survey of Location Prediction on Twitter

An overall picture of location prediction on Twitter is offered, concentrating on the prediction of user home locations, tweet locations, and mentioned locations; the survey defines these three tasks and reviews the evaluation metrics.

Inferring the origin locations of tweets with quantitative confidence

A scalable, content-based approach is presented for estimating the location of tweets using a novel yet simple variant of Gaussian mixture models; toponyms and languages with a small geographic footprint are shown to provide the most useful location signals.

Where has this tweet come from? Fast and fine-grained geolocalization of non-geotagged tweets

This work proposes a framework for geolocating tweets that are not geotagged; it aims to provide accurate geolocation estimates at a fine grain (i.e., within a city) by exploiting similarities in content between the post and a set of geotagged tweets.

Spatially Aware Term Selection for Geotagging

This paper investigates the use of kernel density estimation (KDE) to model each term as a two-dimensional probability distribution over the surface of the Earth, and proposes two classes of term selection techniques that draw on standard geostatistical methods such as Ripley's K statistic.
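
As a rough illustration of the Ripley's K statistic mentioned above, the sketch below computes a naive estimate without edge correction, using the common n(n-1) normalization. Under complete spatial randomness K(r) is approximately πr², so a term whose estimate lies well above that value is spatially clustered at scale r. How the paper turns this into a term selection score is not stated here; the function name and the toy points are hypothetical.

```python
import math


def ripley_k(points, radius, area):
    """Naive Ripley's K estimate for planar points in a region of given area.

    No edge correction is applied, which is a simplification.
    """
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    pairs = 0
    for i, (xi, yi) in enumerate(points):
        for j, (xj, yj) in enumerate(points):
            # Count ordered pairs of distinct points within the radius.
            if i != j and math.hypot(xi - xj, yi - yj) <= radius:
                pairs += 1
    return area * pairs / (n * (n - 1))


# Occurrences of two hypothetical terms in a 10 x 10 study region.
clustered = [(1.0, 1.0), (1.1, 1.0), (1.0, 1.1), (1.1, 1.1)]
dispersed = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]

r = 0.5
baseline = math.pi * r ** 2                 # expected K(r) under randomness
print(ripley_k(clustered, r, 100.0))        # 100.0, far above the baseline
print(ripley_k(dispersed, r, 100.0))        # 0.0, below the baseline
print(baseline)
```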

Modeling locations with social media

This paper proposes a statistical language modeling approach to identifying locations in arbitrary text and investigates several ways to estimate the models, based on term frequency and on user frequency; the results show that estimation strategies based on user frequency are much more reliable than approaches based on raw term frequency.
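
The difference between raw term frequency and user frequency can be sketched as follows. This is an illustrative unigram model with add-one smoothing, not the estimation procedure of the paper: each (location, term) pair is counted once per distinct user, so a single prolific user cannot dominate the model. The function name, the smoothing choice, and the toy posts are assumptions.

```python
from collections import defaultdict


def user_frequency_model(posts, smoothing=1.0):
    """Build P(term | location) from the number of distinct users per term.

    `posts` is an iterable of (user, location, terms) tuples.
    """
    users_per_term = defaultdict(set)   # (location, term) -> set of users
    vocab = set()
    for user, loc, terms in posts:
        for term in set(terms):
            users_per_term[(loc, term)].add(user)
            vocab.add(term)

    totals = defaultdict(int)           # location -> sum of user counts
    for (loc, _term), users in users_per_term.items():
        totals[loc] += len(users)

    def prob(term, loc):
        count = len(users_per_term.get((loc, term), ()))
        return (count + smoothing) / (totals.get(loc, 0) + smoothing * len(vocab))

    return prob


# One user repeats "fjord" many times; three users each mention "opera".
posts = [
    ("u1", "oslo", ["fjord", "fjord", "fjord", "opera"]),
    ("u1", "oslo", ["fjord"]),
    ("u2", "oslo", ["opera"]),
    ("u3", "oslo", ["opera"]),
]
prob = user_frequency_model(posts)
print(prob("opera", "oslo") > prob("fjord", "oslo"))  # True
```

By raw term frequency "fjord" (4 occurrences) would outrank "opera" (3 occurrences); by user frequency the order is reversed, because only one user ever wrote "fjord".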

A Latent Variable Model for Geographic Lexical Variation

A multi-level generative model that reasons jointly about latent topics and geographical regions is presented, which recovers coherent topics and their regional variants, while identifying geographic areas of linguistic consistency.

Geo-spatial Domain Expertise in Microblogs

A novel way of casting the expertise problem is investigated, using points of interest (POIs) as a possible categorization of expertise, and a classification scheme that can reliably identify domain experts is designed.

Geo-temporal distribution of tag terms for event-related image retrieval

A Simple Scalable Neural Networks based Model for Geolocation Prediction in Twitter

The model classifies a tweet or a user to a city using a simple neural network structure with fully-connected layers and average pooling, and shows a promising extension of neural-network-based models for geolocation prediction.