Jürgen Pfeffer

Twitter is a social media giant famous for the exchange of short, 140-character messages called “tweets”. In the scientific community, the microblogging site is known for openness in sharing its data. It provides a glance into its millions of users and billions of tweets through a “Streaming API” which provides a sample of all tweets matching some…
SCIENCE sciencemag.org On 3 November 1948, the day after Harry Truman won the United States presidential elections, the Chicago Tribune published one of the most famous erroneous headlines in newspaper history: “Dewey Defeats Truman” (1, 2). The headline was informed by telephone surveys, which had inadvertently undersampled Truman…
This paper presents a study of the life cycle of news articles posted online. We describe the interplay between website visitation patterns and social media reactions to news content. We show that we can use this hybrid observation method to characterize distinct classes of articles. We also find that social media reactions can help predict future…
A lot of centrality measures have been developed to analyze different aspects of importance. Some of the most popular centrality measures (e.g. betweenness centrality, closeness centrality) are based on the calculation of shortest paths. This characteristic limits the applicability of these measures for larger networks. In this article we elaborate on the…
Disaster response agencies incorporate social media as a source of fast-breaking information to understand the needs of people affected by the many crises that occur around the world. These agencies look for tweets from within the region affected by the crisis to get the latest updates on the status of the affected region. However, only 1% of all tweets are…
Twitter shares a free 1% sample of its tweets through the "Streaming API". Recently, research has pointed to evidence of bias in this source. The methodologies proposed in previous work rely on the restrictive and expensive Firehose to find the bias in the Streaming API data. We tackle the problem of finding sample bias without costly and restrictive…
Geotagged tweets are an exciting and increasingly popular data source, but like all social media data, they potentially have biases in who is represented. Motivated by this, we investigate the question, ‘are users of geotagged tweets randomly distributed over the US population’? We link approximately 144 million geotagged tweets within the US, representing…
The visualization and analysis of dynamic social networks are challenging problems, demanding the simultaneous consideration of relational and temporal aspects. In order to follow the evolution of a network over time, we need to detect not only which nodes and which links change and when these changes occur, but also the impact they have on their…