Corpus ID: 16868223

Ensemble of Generative and Discriminative Techniques for Sentiment Analysis of Movie Reviews

@article{Mesnil2015EnsembleOG,
  title={Ensemble of Generative and Discriminative Techniques for Sentiment Analysis of Movie Reviews},
  author={Gr{\'e}goire Mesnil and Tomas Mikolov and Marc'Aurelio Ranzato and Yoshua Bengio},
  journal={arXiv: Computation and Language},
  year={2015}
}
Sentiment analysis is a common task in natural language processing that aims to detect the polarity of a text document (typically a consumer review). [...] We show how to use standard generative language models for this task, which are slightly complementary to the state-of-the-art techniques. We achieve strong results on a well-known dataset of IMDB movie reviews. Our results are easily reproducible, as we also publish the code needed to repeat the experiments. This should simplify further advance…
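The abstract describes combining generative and discriminative models into an ensemble; a minimal sketch of weighted score interpolation across per-model probabilities (model names, weights, and data are illustrative, not the paper's exact setup):

```python
import numpy as np

def ensemble_scores(model_probs, weights):
    """Linearly interpolate per-document positive-class probabilities
    from several models. model_probs: shape (n_models, n_docs)."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                      # normalize the mixing weights
    return w @ np.asarray(model_probs)  # weighted average per document

# Illustrative probabilities from three hypothetical component models
probs = [[0.9, 0.2, 0.6],   # e.g. a generative language-model score
         [0.8, 0.4, 0.7],   # e.g. NB-SVM decision scores mapped to [0, 1]
         [0.7, 0.1, 0.9]]   # e.g. a paragraph-vector classifier
combined = ensemble_scores(probs, weights=[1, 1, 1])
labels = (combined > 0.5).astype(int)   # 1 = positive review
```

In practice the mixing weights would be tuned on a validation set rather than fixed to be uniform.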
Adding CNNs to the Mix: Stacking models for sentiment classification
This work takes the best-performing sentiment analysis model, an ensemble of NB-SVM, Paragraph2Vec, and an RNN, adds a CNN to the stacking model, and shows that the extended ensemble performs better than the existing one.
Learning to Generate Reviews and Discovering Sentiment
The properties of byte-level recurrent language models are explored, and a single unit that performs sentiment analysis is found; the model achieves state of the art on the binary subset of the Stanford Sentiment Treebank.
Sentiment Analysis on Movie Scripts and Reviews
This study offers a different approach based on the emotionally analyzed concatenation of movie scripts and their respective reviews; the results indicate that the proposed combination of features achieves notable performance, similar to conventional approaches.
Sentiment Analysis on Movie Review Using Deep Learning RNN Method
The deep learning-based classification algorithm RNN was applied, the classifier's performance was measured depending on the pre-processing of the data, and an accuracy of 94.61% was obtained.
An Ensemble Method with Sentiment Features and Clustering Support
A Convolutional Neural Network and Long Short-Term Memory were used to learn sentiment-specific features in a freezing scheme; this scenario provides a novel and efficient way to integrate the advantages of deep learning models.
Reliable Baselines for Sentiment Analysis in Resource-Limited Languages: The Serbian Movie Review Dataset
A dataset-balancing algorithm is presented that minimizes sample selection bias by eliminating irrelevant systematic differences between the sentiment classes, and its superiority over random sampling is demonstrated.
Sentiment Analysis Using Averaged Weighted Word Vector Features
This work develops two methods that combine different types of word vectors to learn and estimate the polarity of reviews, ensembles the techniques with each other and with existing methods, and compares them with approaches in the literature.
Learning Document Embeddings by Predicting N-grams for Sentiment Classification of Long Movie Reviews
The architecture of the recently proposed Paragraph Vector is modified so that it learns document vectors by predicting not only words but n-gram features as well, capturing both semantics and word order in documents while keeping the expressive power of the learned vectors.
Distributed Text Representation with Weighting Scheme Guidance for Sentiment Analysis
This paper takes advantage of techniques from both lines of research: weighting schemes are introduced into neural models to guide the networks to focus on important words, and the authors find that better features can be extracted for sentiment analysis.
Reinforcing the Topic of Embeddings with Theta Pure Dependence for Text Classification
This work proposes to incorporate Theta Pure Dependence (TPD) into the Paragraph Vector method to reinforce topical and sentiment information; it outperforms the state of the art on text classification tasks.

References

Showing 1–10 of 11 references
Learning Word Vectors for Sentiment Analysis
This work presents a model that uses a mix of unsupervised and supervised techniques to learn word vectors capturing semantic term-document information as well as rich sentiment content, and finds that it outperforms several previously introduced methods for sentiment classification.
Semi-Supervised Recursive Autoencoders for Predicting Sentiment Distributions
A novel machine-learning framework based on recursive autoencoders for sentence-level prediction of sentiment label distributions; it outperforms other state-of-the-art approaches on commonly used datasets without using any pre-defined sentiment lexica or polarity-shifting rules.
Baselines and Bigrams: Simple, Good Sentiment and Topic Classification
It is shown that including word bigram features gives consistent gains on sentiment analysis tasks, and that a simple but novel SVM variant using NB log-count ratios as feature values consistently performs well across tasks and datasets.
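The NB-SVM variant described in this reference uses naive Bayes log-count ratios as feature values; a minimal numpy sketch of the ratio computation on toy data (the variable names and corpus are illustrative, not the paper's setup):

```python
import numpy as np

def nb_log_count_ratio(X, y, alpha=1.0):
    """NB log-count ratio r from a binary term-count matrix X and
    labels y: r = log((p / |p|_1) / (q / |q|_1)), with add-alpha
    smoothed class-conditional feature counts p and q."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    p = alpha + X[y == 1].sum(axis=0)   # smoothed counts in positive docs
    q = alpha + X[y == 0].sum(axis=0)   # smoothed counts in negative docs
    return np.log((p / p.sum()) / (q / q.sum()))

# Toy data: columns = features ("great", "terrible"), rows = documents
X = [[1, 0], [1, 0], [0, 1], [0, 1]]
y = [1, 1, 0, 0]
r = nb_log_count_ratio(X, y)
# "great" gets a positive ratio, "terrible" a negative one; the SVM
# variant then trains a linear classifier on the scaled features x * r.
```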
Opinion Mining and Sentiment Analysis
This survey covers techniques and approaches that promise to directly enable opinion-oriented information-seeking systems, focusing on methods that address the new challenges raised by sentiment-aware applications compared to those already present in traditional fact-based analysis.
Distributed Representations of Sentences and Documents
Paragraph Vector is an unsupervised algorithm that learns fixed-length feature representations from variable-length pieces of text, such as sentences, paragraphs, and documents; its construction gives the algorithm the potential to overcome the weaknesses of bag-of-words models.
Statistical Language Models Based on Neural Networks
Although these models are computationally more expensive than N-gram models, the presented techniques make it possible to apply them efficiently to state-of-the-art systems, achieving the best published performance on the well-known Penn Treebank setup.
Improved backing-off for M-gram language modeling
  • Reinhard Kneser, H. Ney
  • Computer Science
  • 1995 International Conference on Acoustics, Speech, and Signal Processing
  • 1995
This paper proposes to use distributions that are specially optimized for the backing-off task, which are quite different from the probability distributions usually used for backing off.
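The specially optimized backing-off distributions proposed here are the continuation counts of Kneser–Ney smoothing; a minimal sketch of the unigram continuation probability (illustrative toy bigrams, not the full smoothed model):

```python
from collections import Counter

def continuation_prob(bigrams):
    """Kneser-Ney unigram continuation probability: the number of
    distinct left contexts a word completes, normalized over the
    total number of distinct bigram types."""
    types = set(bigrams)                       # distinct (w_prev, w) pairs
    left_contexts = Counter(w for _, w in types)
    total_types = len(types)
    return {w: n / total_types for w, n in left_contexts.items()}

# "francisco" may be frequent but follows almost only "san", so its
# continuation probability is low despite its raw count.
bigrams = [("san", "francisco"), ("san", "francisco"),
           ("the", "dog"), ("a", "dog"), ("my", "dog")]
p = continuation_prob(bigrams)
```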
Recurrent neural network based language model
Results indicate that a roughly 50% reduction in perplexity is possible by using a mixture of several RNN language models, compared to a state-of-the-art backoff language model.
On the difficulty of training recurrent neural networks
This paper proposes a gradient-norm clipping strategy to deal with exploding gradients and a soft constraint for the vanishing-gradient problem, and empirically validates the hypothesis and the proposed solutions.
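The exploding-gradient remedy this reference proposes is easy to state concretely; a minimal sketch of gradient-norm clipping (the threshold value is illustrative):

```python
import numpy as np

def clip_grad_norm(grad, max_norm):
    """Rescale grad so its L2 norm does not exceed max_norm,
    preserving its direction; gradients under the threshold
    pass through unchanged."""
    norm = np.linalg.norm(grad)
    if norm > max_norm:
        grad = grad * (max_norm / norm)
    return grad

g = np.array([3.0, 4.0])              # L2 norm 5.0
clipped = clip_grad_norm(g, max_norm=1.0)
```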
SRILM - an extensible language modeling toolkit
The functionality of the SRILM toolkit is summarized and its design and implementation are discussed, highlighting ease of rapid prototyping, reusability, and combinability of tools.