Developing Age and Gender Predictive Lexica over Social Media

Maarten Sap, Gregory J. Park, Johannes C. Eichstaedt, Margaret L. Kern, David Stillwell, Michal Kosinski, Lyle H. Ungar, H. Andrew Schwartz
Demographic lexica have potential for widespread use in social science, economic, and business applications. We derive predictive lexica (words and weights) for age and gender using regression and classification models from word usage in Facebook, blog, and Twitter data with associated demographic labels. The lexica, made publicly available, achieved state-of-the-art accuracy in language-based age and gender prediction over Facebook and Twitter, and were evaluated for generalization across…
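As a rough illustration of what a lexicon of "words and weights" means in practice, the sketch below scores a text as an intercept plus the weighted relative frequencies of its lexicon words. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name `lexicon_score`, the toy words, the weights, and the intercept are all hypothetical and are not taken from the published lexica.

```python
from collections import Counter

# Hypothetical toy lexicon: words, weights, and intercept are illustrative
# only and are NOT values from the published lexica.
TOY_AGE_LEXICON = {"homework": -2.0, "mortgage": 3.0, "lol": -1.0}
TOY_INTERCEPT = 25.0


def lexicon_score(tokens, lexicon, intercept=0.0):
    """Return intercept + sum of weight * relative frequency over lexicon words.

    Assumes a linear scoring rule; relative frequency is the word's count
    divided by the total number of tokens in the text.
    """
    if not tokens:
        # No text means no evidence, so fall back to the intercept alone.
        return intercept
    counts = Counter(tokens)
    total = len(tokens)
    return intercept + sum(
        weight * counts[word] / total
        for word, weight in lexicon.items()
        if word in counts
    )


# Whitespace tokenization is a simplification for this example.
tokens = "lol i have so much homework lol".split()
score = lexicon_score(tokens, TOY_AGE_LEXICON, TOY_INTERCEPT)
```

For a continuous target such as age, the score would be read directly as the estimate; for a binary target such as gender, one plausible convention is to threshold the score at zero.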





