Thumbs up? Sentiment Classification using Machine Learning Techniques
TLDR
This work considers the problem of classifying documents not by topic, but by overall sentiment, e.g., determining whether a review is positive or negative, and concludes by examining factors that make the sentiment classification problem more challenging.
A Sentimental Education: Sentiment Analysis Using Subjectivity Summarization Based on Minimum Cuts
TLDR
A novel machine-learning method is proposed that applies text-categorization techniques to just the subjective portions of the document; these portions are extracted by finding minimum cuts in graphs, which greatly facilitates incorporation of cross-sentence contextual constraints.
Opinion Mining and Sentiment Analysis
TLDR
This survey covers techniques and approaches that promise to directly enable opinion-oriented information-seeking systems and focuses on methods that seek to address the new challenges raised by sentiment-aware applications, as compared to those that are already present in more traditional fact-based analysis.
Seeing Stars: Exploiting Class Relationships for Sentiment Categorization with Respect to Rating Scales
TLDR
A meta-algorithm is applied, based on a metric labeling formulation of the rating-inference problem, that alters a given n-ary classifier's output in an explicit attempt to ensure that similar items receive similar labels.
Measures of Distributional Similarity
TLDR
This work presents an empirical comparison of a broad range of measures; a classification of similarity functions based on the information that they incorporate; and the introduction of a novel function that is superior at evaluating potential proxy distributions.
Distributional Clustering of English Words
TLDR
Deterministic annealing is used to find lowest-distortion sets of clusters: as the annealing parameter increases, existing clusters become unstable and subdivide, yielding a hierarchical "soft" clustering of the data.
Winning Arguments: Interaction Dynamics and Persuasion Strategies in Good-faith Online Discussions
TLDR
It is shown that persuasive arguments are characterized by interesting patterns of interaction dynamics, such as participant entry-order and degree of back-and-forth exchange, and that stylistic choices in how the opinion is expressed carry predictive power.
Get out the vote: Determining support or opposition from Congressional floor-debate transcripts
TLDR
It is found that the incorporation of sources of information regarding relationships between discourse segments, such as whether a given utterance indicates agreement with the opinion expressed by another, yields substantial improvements over classifying speeches in isolation.
Catching the Drift: Probabilistic Content Models, with Applications to Generation and Summarization
TLDR
An effective knowledge-lean method for learning content models from unannotated documents is presented, utilizing a novel adaptation of algorithms for Hidden Markov Models and applied to two complementary tasks: information ordering and extractive summarization.
Learning to Paraphrase: An Unsupervised Approach Using Multiple-Sequence Alignment
TLDR
This work applies multiple-sequence alignment to sentences gathered from unannotated comparable corpora: it learns a set of paraphrasing patterns represented by word lattice pairs and automatically determines how to apply these patterns to rewrite new sentences.