• Corpus ID: 152282879

A Comparison of Techniques for Sentiment Classification of Film Reviews

Milan Gritta
We compare lexicon-based sentiment classification of film reviews with machine learning approaches. We review existing methodologies and attempt to emulate and improve on them using a 'given' lexicon and a bag-of-words approach. We also utilise syntactic information such as part-of-speech tags and dependency relations. We show that a simple lexicon-based classifier achieves good results; however, machine learning techniques prove to be the superior tool. We also… 
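As a rough illustration of what lexicon-based classification means here, the following minimal sketch counts positive and negative words in a review and labels it by the sign of the total. The word lists, the function names, and the rule that a tie counts as positive are all assumptions made for illustration; the paper's 'given' lexicon and its actual scoring scheme are not reproduced.

```python
import re

# Hypothetical toy lexicon for illustration only; not the lexicon used in the paper.
POSITIVE = {"good", "great", "excellent", "enjoyable", "superb"}
NEGATIVE = {"bad", "boring", "awful", "dull", "poor"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def lexicon_score(text):
    """Return (#positive tokens) - (#negative tokens) for the review."""
    return sum((tok in POSITIVE) - (tok in NEGATIVE) for tok in tokenize(text))


def classify(text):
    """Label a review by the sign of its score; ties go to 'positive' (an assumption)."""
    return "positive" if lexicon_score(text) >= 0 else "negative"


if __name__ == "__main__":
    print(classify("A superb, enjoyable film with a great cast."))
    print(classify("Dull plot and awful acting."))
```

A bag-of-words machine learning classifier would instead learn a weight for each word from labelled reviews rather than relying on a fixed word list.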


Sentiment Analysis using Naive Bayes Classifier and Information Gain Feature Selection over Twitter

This study aims to improve the accuracy of the Naïve Bayes algorithm for document classification by combining it with Information Gain feature selection, and to evaluate sentiment using an opinion mining method.



Thumbs up? Sentiment Classification using Machine Learning Techniques

This work considers the problem of classifying documents not by topic, but by overall sentiment, e.g., determining whether a review is positive or negative, and concludes by examining factors that make the sentiment classification problem more challenging.

Harnessing WordNet Senses for Supervised Sentiment Classification

It is shown that even if a WSD engine disambiguates only a limited set of words in a document, a sentiment classifier still performs better than it does in the absence of sense annotation.

Dependency Forest for Sentiment Analysis

A forest-based approach that applies dependency forests to sentiment analysis and develops new algorithms for extracting features from them, achieving state-of-the-art performance on the sentiment dataset.

Using Anaphora Resolution to Improve Opinion Target Identification in Movie Reviews

This paper empirically evaluates whether using an off-the-shelf anaphora resolution algorithm can improve the performance of a baseline opinion mining system, and presents an analysis based on two different anaphora resolution systems.

Discourse Connectors for Latent Subjectivity in Sentiment Analysis

A new method for injecting linguistic knowledge into latent variable subjectivity modeling using discourse connectors, together with a simple heuristic for automatically identifying connectors when no predefined list is available.

An empirical study of the naive Bayes classifier

This work analyzes the impact of the distribution entropy on the classification error, showing that low-entropy feature distributions yield good performance of naive Bayes, and demonstrates that naive Bayes works well for certain nearly-functional feature dependencies.

The WEKA data mining software: an update

This paper provides an introduction to the WEKA workbench, reviews the history of the project, and, in light of the recent 3.6 stable release, briefly discusses what has been added since the last stable version (Weka 3.4) released in 2003.

Fast training of support vector machines using sequential minimal optimization, advances in kernel methods

SMO breaks the large quadratic programming (QP) problem of SVM training into a series of smallest possible QP problems, which avoids using a time-consuming numerical QP optimization as an inner loop; hence SMO is fastest for linear SVMs and sparse data sets.

Association for Computational Linguistics, Stroudsburg, PA, USA

  • Proceedings of the Conference on Em-
  • 2011