Corpus ID: 9859539

Tracking the Diffusion of Named Entities

Leon Derczynski and Matthew Rowe
Existing studies of how information diffuses across social networks have thus far concentrated on analysing and recovering the spread of deterministic innovations such as URLs, hashtags, and group membership. However, investigating how mentions of real-world entities appear and spread has yet to be explored, largely due to the computationally intractable nature of performing large-scale entity extraction. In this paper we present, to the best of our knowledge, one of the first pieces of work to…
1 Citation
The anatomy of Reddit: An overview of academic research
This survey maps the main research directions that arose in recent years and points to different types of methodologies for extracting information from the structure and dynamics of the system.


Differences in the mechanics of information diffusion across topics: idioms, political hashtags, and complex contagion on twitter
The first large-scale validation of the "complex contagion" principle from sociology, which posits that repeated exposures to an idea are particularly crucial when the idea is in some way controversial or contentious, is provided.
Learning influence probabilities in social networks
This paper proposes models and algorithms for learning the model parameters and for testing the learned models to make predictions, and develops techniques for predicting the time by which a user may be expected to perform an action.
Patterns of Cascading Behavior in Large Blog Graphs
Some surprising findings about the blog linking and information propagation structure are reported after analyzing one of the largest available datasets, with 45,000 blogs and ≈ 2.2 million blog postings.
Navigating the massive world of reddit: Using backbone networks to map user interests in social media
It is suggested that the integration of interest maps into popular social media platforms will assist users in organizing themselves into more specific interest groups, which will help alleviate the overcrowding effect often observed in large online communities.
Enhanced Information Access to Social Streams Through Word Clouds with Entity Grouping
This paper proposes a method for improving word cloud generation over social streams. It finds that word clouds with grouped named entities attain significantly broader coverage and significantly decreased content duplication, and it supports MAP as a tool for predicting word cloud quality without requiring a human in the loop.
The role of social networks in information diffusion
It is shown that, although stronger ties are individually more influential, it is the more abundant weak ties that are responsible for the propagation of novel information, suggesting that weak ties may play a more dominant role in the dissemination of information online than currently believed.
We know what @you #tag: does the dual role affect hashtag adoption?
This work proposes comprehensive measures to quantify the major factors of how a user selects content tags as well as joins communities, and proves the effectiveness of the dual role, where both the content measures and the community measures significantly correlate to hashtag adoption on Twitter.
Analysis of named entity recognition and linking for tweets
This work describes a new Twitter entity disambiguation dataset and conducts an empirical analysis of named entity recognition and disambiguation, investigating how robust a number of state-of-the-art systems are on such noisy texts, what the main sources of error are, and which problems should be further investigated to improve the state of the art.
On the endogenesis of Twitter's Spritzer and Gardenhose sample streams
Evidence is found that reveals the method used by Twitter to decide which tweets will show up in the random sample streams, and an overview is provided of how Twitter's unique tweet IDs are generated, explaining the regularities of each part of a tweet ID.
Evolution of reddit: from the front page of the internet to a self-referential community?
Investigations suggest that Reddit has transformed itself from a dedicated gateway to the Web to an increasingly self-referential community that focuses on and reinforces its own user-generated image- and textual content over external sources.