With the proliferation of social media tools such as Facebook and Twitter, the CSCW community has seen growing interest among researchers in turning to records of social behavior from blogs, social media, and social networking sites to study human social behavior. This nascent area, which has begun to be referred to in various research circles as…
Scholars have often relied on name initials to resolve name ambiguities in large-scale coauthorship network research. This approach bears the risk of incorrectly merging or splitting author identities. The use of initial-based disambiguation has been justified by the assumption that such errors would not affect research findings too much. This paper tests…
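To make the merging risk concrete, here is a minimal sketch of initial-based disambiguation. It is an illustration only, not the procedure evaluated in the paper: the key format (surname plus first initial), the helper names `initial_key` and `merge_by_initials`, and the sample names are all assumptions.

```python
from collections import defaultdict


def initial_key(full_name):
    """Reduce 'Given [Middle] Surname' to 'surname, g' (hypothetical key format)."""
    parts = full_name.strip().split()
    surname, given = parts[-1], parts[0]
    return f"{surname.lower()}, {given[0].lower()}"


def merge_by_initials(names):
    """Group name strings that share the same surname and first initial."""
    groups = defaultdict(set)
    for name in names:
        groups[initial_key(name)].add(name)
    return dict(groups)


# Made-up names: two different people end up under one key.
names = ["Jane Kim", "Jinho Kim", "Jane Kim", "Alice Smith"]
groups = merge_by_initials(names)
```

In this toy input, "Jane Kim" and "Jinho Kim" fall under the same key `kim, j`, which is the kind of incorrect merge the abstract describes. Splitting errors arise the other way around, when one person's name is recorded with different initials.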
Applying the concept of triadic closure to coauthorship networks means that scholars are likely to publish a joint paper if they have previously coauthored with the same people. Prior research has identified moderate to high (20 to 40%) closure rates, suggesting that this mechanism is a reasonable explanation for tie formation between future coauthors. We show…
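One way to read a closure rate is as the share of new coauthor pairs who already had a common coauthor. The sketch below uses that definition as an assumption; the excerpt does not state how the paper or the prior studies measure closure. The function names and the toy paper lists are hypothetical.

```python
from itertools import combinations


def coauthor_edges(papers):
    """Return the set of unordered coauthor pairs across a list of author lists."""
    edges = set()
    for authors in papers:
        for a, b in combinations(sorted(set(authors)), 2):
            edges.add((a, b))
    return edges


def closure_rate(earlier_papers, later_papers):
    """Fraction of new coauthor pairs that shared a coauthor in the earlier period."""
    old = coauthor_edges(earlier_papers)
    neighbors = {}
    for a, b in old:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    # Only pairs that had not coauthored before count as new ties.
    new = coauthor_edges(later_papers) - old
    if not new:
        return 0.0
    closed = sum(
        1 for a, b in new
        if neighbors.get(a, set()) & neighbors.get(b, set())
    )
    return closed / len(new)


# Toy data: A and C share coauthor B; A and D share no one.
earlier = [["A", "B"], ["B", "C"], ["D", "E"]]
later = [["A", "C"], ["A", "D"]]
rate = closure_rate(earlier, later)
```

With this toy input, one of the two new pairs closes a triangle, so `rate` is 0.5. Real studies would also need to fix the time windows and handle author name ambiguity before counting pairs.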
In this paper, we evaluate the predictability of tweets associated with controversial versus non-controversial topics. As a first step, we crowd-sourced the scoring of a predefined set of topics on a Likert scale from non-controversial to controversial. Our feature set includes and goes beyond sentiment features, e.g., by leveraging empathic language and…
The popularity and availability of Twitter as a service and a data source have fueled the interest in sentiment analysis. Previous research has shed light on the challenges that contextualizing effects and linguistic complexities pose for the accurate sentiment classification of tweets. We test the effect of adding manually-annotated, corpus-based hashtags…
We present novel research at the intersection of review mining and impact assessment of issue-focused information products, namely documentary films. We develop and evaluate a theoretically grounded classification schema, related codebook, corpus annotation, and prediction model for detecting multiple types of impact that documentaries can have on…
This study investigates the evolution and structure of a national-scale co-publishing network in Korea from 1948 to 2011. We analyzed more than 700,000 papers published by approximately 415,000 authors for temporal changes in productivity and network properties with a yearly resolution. The resulting statistical properties were compared to findings from…
We extend classic review mining work by building a binary classifier that predicts whether a review of a documentary film was written by an expert or a layman with 90.70% accuracy (F1 score), and compare the characteristics of the predicted classes. A variety of standard lexical and syntactic features was used for this supervised learning task. Our results…
User-authored reviews offer a window into micro-level engagement with issue-focused documentary films, which is a critical yet insufficiently understood topic in media impact assessment. Based on our data, features, and supervised learning method, we find that ratings of non-documentary (feature film) reviews can be predicted with higher accuracy (73.67%, …