In this paper, we extend existing work on latent attribute inference by leveraging the principle of homophily: we evaluate the inference accuracy gained by augmenting the user features with features derived from the Twitter profiles and postings of her friends. We consider three attributes which have varying degrees of assortativity: gender, age, and …
Despite significant work on the problem of inferring a Twitter user’s gender from her online content, no systematic investigation has been made into leveraging the most obvious signal of a user’s gender: first name. In this paper, we perform a thorough investigation of the link between gender and first name in English tweets. Our work makes several …
Geolocated social media data provides a powerful source of information about place and regional human behavior. Because little social media data is geolocation-annotated, inference techniques serve an essential role for increasing the volume of annotated data. One major class of inference approaches has relied on the social network of Twitter, where the …
SCIENCE sciencemag.org On 3 November 1948, the day after Harry Truman won the United States presidential elections, the Chicago Tribune published one of the most famous erroneous headlines in newspaper history: “Dewey Defeats Truman” (1, 2). The headline was informed by telephone surveys, which had inadvertently undersampled Truman …
While much work has considered the problem of latent attribute inference for users of social media such as Twitter, little has been done on non-English-based content and users. Here, we conduct the first assessment of latent attribute inference in languages beyond English, focusing on gender inference. We find that the gender inference problem in quite …
Studying the control properties of complex networks provides insight into how designers and engineers can influence these systems to achieve a desired behavior. Topology of a network has been shown to strongly correlate with certain control properties; here we uncover the fundamental structures that explain the basis of this correlation. We develop the …
In order for a municipality to effectively service and engage its constituency, it must understand the composition of the communities within it. Up to the present, such demographic estimates for target populations have been obtained largely from census data or expensive, time-intensive surveys. In this paper, we use Twitter microblog content to estimate the …
Much work on the demographics of social media platforms such as Twitter has focused on the properties of individuals, such as gender or age. However, because credible detectors for organization accounts do not exist, these and future large-scale studies of human behavior on social media can be contaminated by the presence of accounts belonging to …