• Corpus ID: 7360591

A Computational Analysis of Collective Discourse

Vahed Qazvinian and Dragomir R. Radev
This paper focuses on the computational analysis of collective discourse, a collective behavior seen in nonexpert content contributions in online social media. We collect and analyze a wide range of real-world collective discourse datasets, from movie user reviews to microblogs and from news headlines to scientific citations. We show that all these datasets exhibit diversity of perspective, a property seen in other collective systems and a criterion in wise crowds. Our experiments also confirm that…

DClaims: A Censorship Resistant Web Annotations System

The web plays a critical role in informing modern democracies and it is essential to allow people to have access to reliable sources of information – to the point of the identification and classification of low-quality information.

Wise Crowd Content Assessment and Educational Rubrics

This work compares a main ideas rubric used in a successful writing intervention study to a highly reliable wise-crowd content assessment method developed to evaluate machine-generated summaries.

Evaluation of semantic dependencies in a conceptual co-occurrence network of a medical vocabulary

To enable creation of new adaptive personalized health support tools, an evaluation of semantic dependencies in a conceptual co-occurrence network covering a set of concepts of a medical vocabulary is carried out.

Using Collective Discourse to Generate Surveys of Scientific Paradigms

Enabling personalized healthcare by analyzing semantic dependencies in a conceptual co-occurrence network based on a medical vocabulary

An evaluation of semantic dependencies in a conceptual co-occurrence network covering a set of concepts of a medical vocabulary is carried out to enable creation of new adaptive personalized health support tools.



Learning From Collective Human Behavior to Introduce Diversity in Lexical Choice

Using extensive analysis, this work proposes a novel paradigm for designing summary generation systems that reflect the diversity of perspectives seen in real-life collective summarization. It presents a ranker that employs distributional similarities to build a network of words and captures the diversity of perspectives by detecting communities in this network.

Power of the Few vs. Wisdom of the Crowd: Wikipedia and the Rise of the Bourgeoisie

Although Wikipedia was driven by the influence of “elite” users early on, more recently there has been a dramatic shift in workload to the “common” user. The same pattern is shown in del.icio.us, a very different type of social collaborative knowledge system.

Rumor has it: Identifying Misinformation in Microblogs

This paper addresses the problem of rumor detection in microblogs and explores the effectiveness of three categories of features for correctly identifying rumors: content-based, network-based, and microblog-specific memes. The authors believe that their dataset is the first large-scale dataset on rumor detection.

Examining the consensus between human summaries: initial experiments with factoid analysis

We present a new approach to summary evaluation which combines two novel aspects, namely (a) content comparison between gold standard summary and system summary via factoids, a pseudo-semantic

Evaluating Information Content by Factoid Analysis: Human annotation and stability

This work shows that factoid annotation is highly reproducible, introduces a weighted factoid score, estimates how many summaries are required for stable system rankings, and shows that the factoid scores cannot be sufficiently approximated by unigrams and the DUC information overlap measure.

Assessing Agreement on Classification Tasks: The Kappa Statistic

This paper examines what is wrong with reliability measures as they are currently used for discourse and dialogue work in computational linguistics and cognitive science, and argues that the field would be better off adopting techniques from content analysis.

Harnessing the wisdom of crowds in wikipedia: quality through coordination

Examination of how the number of editors in Wikipedia and the coordination methods they use affect article quality demonstrated the critical importance of coordination in effectively harnessing the "wisdom of the crowd" in online production environments.

Evaluating Content Selection in Summarization: The Pyramid Method

It is argued that the method presented is reliable, predictive, and diagnostic, and thus improves considerably on the human evaluation method currently used in the Document Understanding Conference.

The Structure and Function of Complex Networks

Developments in this field are reviewed, including such concepts as the small-world effect, degree distributions, clustering, network correlations, random graph models, models of network growth and preferential attachment, and dynamical processes taking place on networks.

Collecting Highly Parallel Data for Paraphrase Evaluation

A novel data collection framework is presented that produces highly parallel text data relatively inexpensively and on a large scale that allows for simple n-gram comparisons to measure both the semantic adequacy and lexical dissimilarity of paraphrase candidates.