Hierarchical Discriminative Classification for Text-Based Geolocation


Text-based document geolocation is commonly rooted in language-based information retrieval techniques over geodesic grids. These methods ignore the natural hierarchy of cells in such grids and run afoul of independence assumptions. We demonstrate the effectiveness of using logistic regression models on a hierarchy of nodes in the grid, which improves upon state-of-the-art accuracy by several percent and reduces mean error distances by hundreds of kilometers on data from Twitter, Wikipedia, and Flickr. We also show that logistic regression performs feature selection effectively, assigning high weights to geocentric terms.
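To make the idea of classifying over a hierarchy of grid nodes concrete, the following is a minimal sketch and not the authors' implementation. It assumes a uniform latitude/longitude grid in which each coarse cell splits into `factor` x `factor` finer cells, bag-of-words features, and a greedy top-down descent; the abstract specifies none of these details. All function names and the hand-set toy weights are hypothetical, and a real system would learn the weights by training a logistic regression model at each node.

```python
import math

def cell_id(lat, lon, degrees):
    """Return the (row, col) index of the uniform grid cell containing a point."""
    row = int(math.floor((lat + 90.0) / degrees))
    col = int(math.floor((lon + 180.0) / degrees))
    return (row, col)

def children(parent, factor):
    """List the finer cells nested inside a coarse cell (factor x factor split)."""
    row, col = parent
    return [(row * factor + i, col * factor + j)
            for i in range(factor) for j in range(factor)]

def cell_probabilities(weights, tokens, candidates):
    """Multinomial logistic regression (softmax) over the candidate cells.

    weights maps cell -> {token: weight}; unseen cells or tokens score 0.
    """
    logits = {cell: sum(weights.get(cell, {}).get(t, 0.0) for t in tokens)
              for cell in candidates}
    top = max(logits.values())  # subtract the max for numerical stability
    exps = {cell: math.exp(v - top) for cell, v in logits.items()}
    total = sum(exps.values())
    return {cell: e / total for cell, e in exps.items()}

def classify(tokens, root_cells, weights_by_level, factor):
    """Greedy descent: pick the best cell at each level, then score its children."""
    candidates = root_cells
    best = None
    for weights in weights_by_level:
        probs = cell_probabilities(weights, tokens, candidates)
        best = max(probs, key=probs.get)
        candidates = children(best, factor)
    return best

# Toy weights (hypothetical, hand-set): 10-degree cells, then 5-degree cells.
coarse_weights = {(12, 8): {"austin": 2.0}, (13, 10): {"nyc": 2.0}}
fine_weights = {(24, 16): {"austin": 1.5}, (25, 17): {"lake": 0.5}}

predicted = classify(["austin", "tacos"], [(12, 8), (13, 10)],
                     [coarse_weights, fine_weights], factor=2)
print(predicted)
```

In this toy setup the term "austin" carries a high weight in one coarse cell and one of its children, which mirrors the observation that geocentric terms receive high weights. Greedy descent is used only for brevity; it commits to one branch per level and cannot recover from an early mistake.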

Cite this paper

@inproceedings{Wing2014HierarchicalDC,
  title={Hierarchical Discriminative Classification for Text-Based Geolocation},
  author={Benjamin Wing and Jason Baldridge},
  booktitle={EMNLP},
  year={2014}
}