Knowing the Tweeters: Deriving Sociologically Relevant Demographics from Twitter

Luke S Sloan, Jeffrey Morgan, William Housley, Matthew Leighton Williams, Adam Edwards, Peter Burnap, Omer Farooq Rana
Sociological Research Online, pp. 74-84
A perennial criticism regarding the use of social media in social science research is the lack of demographic information associated with naturally occurring mediated data such as that produced by Twitter. However, the fact that demographic information is not explicit does not mean that it is not implicitly present. Utilising the Cardiff Online Social Media ObServatory (COSMOS), this paper suggests various techniques for establishing or estimating demographic data from a sample of more than 113…
Using Twitter data for demographic research
Best practices for estimating Twitter users’ basic demographic characteristics and a calibration method to address the selection bias in the Twitter population are proposed, allowing researchers to generalize findings based on Twitter to the general population.
Who Tweets in the United Kingdom? Profiling the Twitter Population Using the British Social Attitudes Survey 2015
Results from the British Social Attitudes Survey (BSA) 2015 on Twitter use are reported, finding that there is a disproportionate number of male Twitter users, in relation to both the Census 2011 and previous proxy estimates; that Twitter users are predominantly young, but there are more older users than previously estimated; and that there are strong class effects associated with Twitter use.
Who Tweets? Deriving the Demographic Characteristics of Age, Occupation and Social Class from Twitter User Meta-Data
The age detection tool illustrates the youthfulness of Twitter users compared to the general UK population as of the 2011 Census according to proportions, but projections demonstrate that there is still potentially a large number of older platform users.
Geolocated Social Media Posts are Happier: Understanding the Characteristics of Check-in Posts on Twitter
It is shown that geotagged posts on Twitter exhibit significantly more positivity, are often about joyous and special events such as weddings or graduations, convey more collectivism rather than individualism, and contain more additional features such as hashtags or objects in images, but at the same time generate substantially less engagement.
Who Tweets with Their Location? Understanding the Relationship between Demographic Characteristics and the Use of Geoservices and Geotagging on Twitter
There are significant demographic variations between those who opt in to geoservices and those who geotag their tweets, and it is suggested that Twitter users who publish geographical information are not representative of the wider Twitter population.
Analyzing the EU Migration Crisis as Reflected on Twitter
Twitter data is used to analyze and visualize tweets about the migration crisis in the European Union from 2016 to 2021, using a methodology that structures the data to support a better understanding of complex social media data.
Towards an Ethical Framework for Publishing Twitter Data in Social Research: Taking into Account Users’ Views, Online Context and Algorithmic Estimation
Views of Twitter users through analysis of online survey data; the effect of context collapse and online disinhibition on the behaviours of users; and the publication of identifiable sensitive classifications derived from algorithms are brought to the fore.
Evaluating the Representativeness of Socio-Demographic Variables over Time for Geo-Social Media Data
A generic methodology is presented for investigating the representativeness of geo-social media data for population groups of similar statistical predictive power based on reference data, which shows that densely populated areas tend to be underrepresented consistently in non-spatial models.
Who are Political Retweeters?, Demographic comparison of political retweeters with retweeters of non-political personalities
This paper attempts to fill gaps in the literature regarding the demographics of political retweeters using various techniques on the name and location-related data from most active French political retweeting accounts.
Encountering #Feminism on Twitter: Reflections on a Research Collaboration between Social Scientists and Computer Scientists
The growth of social media presents an unparalleled opportunity for the study of social change. However, the speed and scale of this growth presents challenges for social scientists, particularly


Mapping the Australian Networked Public Sphere
This article reports on a research program that has developed new methodologies for mapping the Australian blogosphere and tracking how information is disseminated across it. The authors improve on
The Coming Crisis of Empirical Sociology
This article argues that in an age of knowing capitalism, sociologists have not adequately thought about the challenges posed to their expertise by the proliferation of 'social' transactional data
De-anonymizing Social Networks
A framework for analyzing privacy and anonymity in social networks is presented and a new re-identification algorithm targeting anonymized social-network graphs is developed, showing that a third of the users who can be verified to have accounts on both Twitter and Flickr can be re-identified in the anonymous Twitter graph.
Gender, Identity, and Language Use in Teenage Blogs
The results suggest that teenagers stay closer to reality in their online expressions of self than has previously been suggested, and that these explorations involve issues, such as learning about their sexuality, that commonly occur during the adolescent years.
Detecting Social Spam Campaigns on Twitter
This work designs an automatic classification system based on machine learning, applies multiple features for classifying spam campaigns, and demonstrates the efficacy of the proposed classification system.
Some Further Reflections on the Coming Crisis of Empirical Sociology
We respond to the two comments on our article 'The Coming Crisis of Empirical Sociology' from Rosemary Crompton (2008) and Richard Webber (2009) which have been published in Sociology, as well as
Earthquake shakes Twitter users: real-time event detection by social sensors
This paper investigates the real-time interaction of events such as earthquakes on Twitter, proposes an algorithm to monitor tweets and detect a target event, and produces a probabilistic spatiotemporal model for the target event that can find the center and the trajectory of the event location.
Gender Identification, Interdependence, and Pseudonyms in CMC: Language Patterns in an Electronic Conference
Analysis of conference transcripts and pseudonym choices indicated that women tended to mask their gender with their pseudonym choice while males did not, and women in both forums generally tended to exhibit certain dimensions of social interdependence more frequently than men.
Predicting gender from electronic discourse.
It is established that people use gender-preferential language in informal electronic discourse and readers of these messages can use these gender-linked language differences to identify the author's gender.