What is Tumblr: a statistical overview and comparison

Yi Chang, Lei Tang, Yoshiyuki Inagaki, Yan Liu
Tumblr, one of the most popular microblogging platforms, has gained momentum recently. [...] Key result: this work serves as an early snapshot of Tumblr that later work can leverage.

Spider and the Flies : Focused Crawling on Tumblr to Detect Hate Promoting Communities

A topic-based web crawler is proposed that works in multiple phases: training a text classifier on examples of hate-promoting users only, extracting the posts of an unknown Tumblr micro-blogger, classifying bloggers as hate-promoting based on their activity feeds, and performing a social network analysis on connected extremist bloggers.

Uncovering Hidden Communities of Extremist Micro-Bloggers : A Case Study of Jihadist Groups on Tumblr

This work proposes a topical-crawler-based approach that performs several tasks: searching for a blogger, computing its similarity against exemplary documents, filtering hate-promoting bloggers, navigating through links to other bloggers, and managing a queue of such bloggers for social network analysis.
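The crawl loop described above (score a blogger against exemplary documents, filter, follow links, manage a queue) can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the names `focused_crawl`, `get_posts` and `get_links`, the use of Jaccard similarity, and the threshold value are all assumptions, and the data source is a caller-supplied function rather than any real Tumblr API.

```python
from collections import deque


def jaccard(a, b):
    """Jaccard similarity between two collections of tokens."""
    a, b = set(a), set(b)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def focused_crawl(seed, get_posts, get_links, exemplars,
                  threshold=0.2, max_visits=100):
    """Breadth-first focused crawl (hypothetical sketch).

    get_posts(blogger) -> iterable of tokens from the blogger's posts
    get_links(blogger) -> iterable of bloggers this blogger links to
    Both callables are assumptions supplied by the caller.
    """
    queue = deque([seed])
    seen = {seed}
    flagged = []   # bloggers that passed the similarity filter
    edges = []     # (source, target) links, kept for later network analysis
    visits = 0
    while queue and visits < max_visits:
        blogger = queue.popleft()
        visits += 1
        tokens = get_posts(blogger)
        # Best similarity against any exemplary document.
        score = max((jaccard(tokens, ex) for ex in exemplars), default=0.0)
        if score < threshold:
            continue  # off-topic: do not expand this blogger's links
        flagged.append(blogger)
        for neighbor in get_links(blogger):
            edges.append((blogger, neighbor))
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return flagged, edges
```

With in-memory dictionaries standing in for the data source, a call might look like `focused_crawl("a", posts.get, links.get, exemplars)`. Only bloggers that pass the filter are expanded, which is what keeps the crawl focused on the topic.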

What people study when they study Tumblr: Classifying Tumblr-related academic research

The breadth of topics covered by social media researchers, which allows us to understand popular online platforms, is highlighted, and practical barriers to research on the Tumblr platform including lack of metadata and access to big data are identified.

Measurement and Modeling of Tumblr Traffic

This work uses a combination of active and passive approaches to network traffic measurement, and develops and calibrates a synthetic workload model for Tumblr network traffic.

Tumblr Blog Recommendation with Boosted Inductive Matrix Completion

A novel boosted inductive matrix completion method (BIMC) for blog recommendation that uses an additive low-rank model for user-blog preferences with two components: one captures the low-rank structure of follow relationships, and the other captures the latent structure using side information.

Deconstructing Diffusion on Tumblr: Structural and Temporal Aspects

This paper examines cascade networks on Tumblr, recreated from the series of diffusion events, and analyses them from structural and temporal perspectives to arrive at a cascade construction model that creates cascade networks, overcoming the problems of missing contextual information and missing or degraded data.

Libraries and Tumblr: a quantitative analysis

Purpose – This study aims to determine how Tumblr is being used by libraries and special collections/archives in the USA through quantitative analysis. Design/methodology/approach – Data on library

Influence and Sentiment Homophily on Twitter

An empirical study is presented that combines existing graph clustering and sentiment analysis techniques to reason about sentiment dynamics at the cluster level and to analyze the role of social influence in sentiment contagion, based on a large dataset extracted from Twitter during the 2014 FIFA World Cup.

Leveraging Blogging Activity on Tumblr to Infer Demographics and Interests of Users for Advertising Purposes

This paper proposes a novel semi-supervised neural language model for categorizing Tumblr content. The model is trained on a large-scale data set of 6.8 billion user posts with a very limited number of categorized keywords, and is shown to outperform the baseline models.



What is Twitter, a social network or a news media?

This work is the first quantitative study on the entire Twittersphere and information diffusion on it and finds a non-power-law follower distribution, a short effective diameter, and low reciprocity, which all mark a deviation from known characteristics of human social networks.
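One of the quantities mentioned above, reciprocity, can be illustrated with a short sketch. It assumes one common definition, the fraction of directed follow edges whose reverse edge also exists; the cited study may define or normalize it differently, and the function name `reciprocity` is only illustrative.

```python
def reciprocity(edges):
    """Fraction of directed edges (u, v) for which (v, u) is also present.

    Self-loops and duplicate edges are ignored. Returns 0.0 for an empty graph.
    """
    edge_set = {(u, v) for u, v in edges if u != v}
    if not edge_set:
        return 0.0
    mutual = sum(1 for (u, v) in edge_set if (v, u) in edge_set)
    return mutual / len(edge_set)
```

For example, in a graph with the follow edges a→b, b→a, a→c and c→d, two of the four edges are reciprocated, so the function returns 0.5.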

Who says what to whom on twitter

A striking concentration of attention is found on Twitter, in that roughly 50% of URLs consumed are generated by just 20K elite users, where the media produces the most information, but celebrities are the most followed.

TwitterRank: finding topic-sensitive influential twitterers

TwitterRank is proposed to measure the influence of users on Twitter. Experimental results show that it outperforms the measure Twitter currently uses as well as other related algorithms, including the original PageRank and Topic-sensitive PageRank.
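For orientation, the original PageRank baseline named above can be sketched as a simple power iteration over a follow graph. This is plain PageRank only, not TwitterRank or Topic-sensitive PageRank, and the graph representation (a dict mapping each follower to the users they follow), the damping factor, and the iteration count are assumptions chosen for illustration.

```python
def pagerank(graph, damping=0.85, iterations=50):
    """Plain PageRank by power iteration.

    graph maps each user to the list of users they follow, so rank
    flows from follower to followee. Returns a dict of user -> score.
    """
    nodes = set(graph)
    for targets in graph.values():
        nodes.update(targets)
    n = len(nodes)
    if n == 0:
        return {}
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Every node receives the teleportation share first.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = graph.get(node, [])
            if targets:
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dangling node (follows nobody): spread its rank evenly.
                share = damping * rank[node] / n
                for target in nodes:
                    new_rank[target] += share
        rank = new_rank
    return rank
```

In a small graph where a and b both follow c, and c follows a, the user c ends up with the highest score and b, who has no followers, with the lowest. A topic-sensitive variant would replace the uniform teleportation share with a topic-specific distribution.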

Why we twitter: understanding microblogging usage and communities

It is found that people use microblogging to talk about their daily activities and to seek or share information. User intentions are also analyzed at the community level to show how users with similar intentions connect with each other.

"I need to try this"?: a statistical overview of pinterest

It is found that being female means more repins, but fewer followers, and that four verbs set Pinterest apart from Twitter: use, look, want and need.

Finding patterns in blog shapes and blog evolution

This work provides several sets of blog and post features that can help distinguish blogs, such as ‘humor’ versus ‘conservative’ blogs, and proposes using PCA to reduce dimensionality so that the resulting clouds of points can be visualized.

Everyone's an influencer: quantifying influence on twitter

It is concluded that word-of-mouth diffusion can only be harnessed reliably by targeting large numbers of potential influencers, thereby capturing average effects and that predictions of which particular user or URL will generate large cascades are relatively unreliable.

The Pin-Bang Theory: Discovering The Pinterest World

This analysis characterizes Pinterest on the basis of large-scale crawls of 3.3 million user profiles and 58.8 million pins, and demonstrates that Pinterest is a potential venue for copyright infringement by showing that almost half of the images shared on Pinterest go uncredited.

Measurement and analysis of online social networks

This paper examines data gathered from four popular online social networks (Flickr, YouTube, LiveJournal, and Orkut) and reports that the indegree of user nodes tends to match the outdegree, that the networks contain a densely connected core of high-degree nodes, and that this core links small groups of strongly clustered, low-degree nodes at the fringes of the network.

Identifying the influential bloggers in a community

The challenges of identifying influential bloggers are discussed, what constitutes an influential blogger is investigated, and a preliminary model that attempts to quantify a blogger's influence is presented, paving the way for a robust model that can find various types of influentials.