Sentiment Analysis of Twitter Data
We examine sentiment analysis on Twitter data. The contributions of this paper are: (1) We introduce POS-specific prior polarity features. (2) We explore the use of a tree kernel to obviate the need...
Evaluating Content Selection in Summarization: The Pyramid Method
It is argued that the method presented is reliable, predictive, and diagnostic, and thus improves considerably over the shortcomings of the human evaluation method currently used in the Document Understanding Conference.
The Pyramid Method: Incorporating human content selection variation in summarization evaluation
This article proposes a method for analysis of multiple human abstracts into semantic content units, which serves as the basis for an evaluation method that incorporates the observed variation and is predictive of different equally informative summaries.
The Manually Annotated Sub-Corpus: A Community Resource for and by the People
The Manually Annotated Sub-Corpus (MASC) project provides data and annotations to serve as the base for a community-wide annotation effort of a subset of the American National Corpus, the first large-scale, open, community-based effort to create much-needed language resources for NLP.
Quantitative and Qualitative Evaluation of Darpa Communicator Spoken Dialogue Systems
It is shown that performance models derived using the standard metrics can account for 37% of the variance in user satisfaction, and that the addition of DATE metrics improved the models by an absolute 5%.
Discourse Segmentation by Human and Automated Means
The first part of this paper presents a method for empirically validating multi-utterance units referred to as discourse segments, and reports highly significant results of segmentations performed by naive subjects, where a commonsense notion of speaker intention is the segmentation criterion.
MASC: the Manually Annotated Sub-Corpus of American English
The Manually Annotated Sub-Corpus (MASC) includes texts from diverse genres and manual or manually validated annotations for multiple levels, including WordNet senses and FrameNet frames and frame elements, both of which have become significant resources in the international computational linguistics community.
Applying the Pyramid Method in DUC 2005
It is found that a modified pyramid score gave good results and would simplify peer annotation in the future. High score correlations between sets from different annotators, together with good interannotator agreement, indicate that participants can perform annotation reliably.
The Benefits of a Model of Annotation
In a case study of word sense annotation, conventional methods for evaluating labels from trained annotators are contrasted with a probabilistic annotation model applied to crowdsourced data.
Computing Reliability for Coreference Annotation
The solution I present accommodates a wide range of coding choices for the annotator, while preserving the same units across codings, and permits a straightforward application of reliability measurement in coreference annotation.