Corpus ID: 513094

Bilingual Experiments with an Arabic-English Corpus for Opinion Mining

Authors: Mohammed Rushdi-Saleh, María Teresa Martín-Valdivia, Luis Alfonso Ureña López, José Manuel Perea Ortega
Opinion Mining (OM) has recently been receiving more attention due to the abundance of forums, blogs, e-commerce websites, news reports and additional web sources where people tend to express their opinions. A number of works on Sentiment Analysis (SA) study the task of identifying polarity, that is, whether the opinion expressed in a text about a given topic is positive or negative. However, most research is focused on English texts and there are very few resources for other languages…
Aspect extraction for sentiment analysis in Arabic dialect
This work takes a more fine-grained approach to Arabic opinion mining at the aspect level, experimenting with methods that have been used in English aspect extraction and contributing a dataset that can be used for further research on Arabic dialect.
An extended analytical study of Arabic sentiments
This paper presents a comprehensive analysis of a relatively large dataset of Arabic comments collected from one of the most widely used social networks in the Arab world, Yahoo!-Maktoob, and shows that SVM outperforms NB and achieves a 64% accuracy level.
This work starts by testing on English texts collected from Amazon, then applies more than one machine learning algorithm to both languages (Arabic and English) and creates a Sentiword Lexicon based on the gathered corpus.
Using Enhanced Lexicon-Based Approaches for the Determination of Aspect Categories and Their Polarities in Arabic Reviews
This work considers two ABSA tasks: aspect category determination and aspect category polarity determination, and makes use of the publicly available human annotated Arabic dataset HAAD along with its baseline experiments conducted by HAAD providers.
Automatic Lexicon Construction for Arabic Sentiment Analysis
This work focuses on a less studied aspect of SA, which is lexicon-based SA for the Arabic language, and an Arabic SA tool is designed and implemented to effectively take advantage of the constructed lexicons.
Sentiment classification on arabic corpora. A preliminary cross-study
The study is carried out to investigate supervised sentiment classification in an Arabic context using two Arabic Corpora which are different in many aspects and three common classifiers known by their effectiveness, namely Naïve Bayes, Support Vector Machines and k-Nearest Neighbor.
The rise of social media (such as online web forums and social networking sites) has attracted interest in mining and analyzing opinions available on the web. The online opinion has become the
Enhancing the determination of aspect categories and their polarities in Arabic reviews using lexicon-based approaches
This work considers two ABSA tasks: aspect category determination and aspect category polarity determination, and makes use of the publicly available human annotated Arabic dataset (HAAD) along with its baseline experiments conducted by HAAD providers.
Human Annotated Arabic Dataset of Book Reviews for Aspect Based Sentiment Analysis
This work fosters the domain of Arabic ABSA and provides a benchmark human annotated Arabic dataset (HAAD), which consists of book reviews in Arabic that have been annotated by humans with aspect terms and their polarities.
Some methods to address the problem of unbalanced sentiment classification in an arabic context
The study addresses the problem of unbalanced data sets in supervised sentiment classification in an Arabic context, using three different methods to under-sample the majority class documents. It shows that Naïve Bayes is sensitive to data set size: the more the authors reduce the data, the more the results degrade.
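The three under-sampling methods studied in that paper are not described here, so the sketch below shows only plain random under-sampling as a generic stand-in: majority-class documents are randomly discarded until every class is the size of the smallest one. The function name `undersample`, the fixed seed, and the data layout are assumptions made for illustration.

```python
import random
from collections import defaultdict


def undersample(docs, labels, seed=0):
    """Randomly under-sample so every class keeps as many documents
    as the smallest class. Returns a shuffled list of (doc, label) pairs.

    Hypothetical helper: a generic random under-sampler, not one of the
    three methods evaluated in the cited study.
    """
    rng = random.Random(seed)

    # Group documents by their class label.
    by_label = defaultdict(list)
    for doc, label in zip(docs, labels):
        by_label[label].append(doc)

    if not by_label:
        return []

    # Size of the minority class sets the per-class quota.
    target = min(len(group) for group in by_label.values())

    balanced = []
    for label, group in by_label.items():
        # rng.sample draws `target` distinct documents without replacement.
        for doc in rng.sample(group, target):
            balanced.append((doc, label))

    rng.shuffle(balanced)
    return balanced
```

Because the quota is the minority-class size, a strongly skewed corpus shrinks sharply after balancing, which is the kind of data reduction the summary above reports Naïve Bayes as being sensitive to.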


Sentiment Analysis of French Movie Reviews
A supervised classification of French movie reviews is presented, where sentiment analysis is based on shallow linguistic features such as POS tagging, chunking and simple negation forms. Results showed that shallow language features significantly improved the classification performance with respect to the bag-of-words baseline.
SENTIWORDNET: A Publicly Available Lexical Resource for Opinion Mining
SENTIWORDNET is a lexical resource in which each WORDNET synset is associated with three numerical scores, Obj, Pos and Neg, describing how objective, positive, and negative the terms contained in the synset are.
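As a rough illustration of the resource described above, the sketch below models an entry with its three scores and averages polarity over a tokenized text. The entries, their score values, and the names `SynsetScores`, `TOY_LEXICON`, and `text_polarity` are hypothetical and are not taken from SENTIWORDNET itself; the lexicon is also keyed by word rather than by synset to keep the example short. The one property carried over is that the three scores of an entry sum to 1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SynsetScores:
    """Positive and negative scores for one entry (hypothetical type)."""
    pos: float
    neg: float

    @property
    def obj(self) -> float:
        # Objectivity is whatever remains, so pos + neg + obj == 1.
        return 1.0 - self.pos - self.neg


# Toy, made-up entries for illustration only.
TOY_LEXICON = {
    "good": SynsetScores(pos=0.75, neg=0.0),
    "bad": SynsetScores(pos=0.0, neg=0.625),
    "film": SynsetScores(pos=0.0, neg=0.0),
}


def text_polarity(tokens):
    """Mean of (pos - neg) over tokens found in the lexicon.

    Returns 0.0 when no token is covered by the lexicon.
    """
    scores = [
        TOY_LEXICON[tok].pos - TOY_LEXICON[tok].neg
        for tok in tokens
        if tok in TOY_LEXICON
    ]
    return sum(scores) / len(scores) if scores else 0.0
```

A value above zero suggests positive polarity and a value below zero negative polarity; where to place the decision threshold is left open here.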
Using SentiWordNet for multilingual sentiment analysis
  • K. Denecke
  • Computer Science
    2008 IEEE 24th International Conference on Data Engineering Workshop
  • 2008
The results show that working with standard technology and existing sentiment analysis approaches is a viable approach to sentiment analysis within a multilingual framework.
Sentiment analysis in multiple languages: Feature selection for opinion classification in Web forums
Stylistic features significantly enhanced performance across all testbeds while EWGA also outperformed other feature selection methods, indicating the utility of these features and techniques for document-level classification of sentiments.
Towards Sentiment Analysis of Financial Texts in Croatian
A statistically significant correspondence is shown between the overall market trend on the Zagreb Stock Exchange and the number of positively and negatively accented articles within periods of trend, and between the general sentiment of articles and the number of polarity phrases within those articles.
Multi-lingual Sentiment Analysis of Financial News Streams
Department of Computing, University of Surrey, Guildford, Surrey, GU2 7XH, United Kingdom. Encouraged by the feasibility demonstration that a relatively low-cost grid
Machine learning in automated text categorization
This survey discusses the main approaches to text categorization that fall within the machine learning paradigm and discusses in detail issues pertaining to three different problems, namely, document representation, classifier construction, and classifier evaluation.
Machine learning
Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis.
EmotiBlog: an Annotation Scheme for Emotion Detection and Analysis in Non-traditional Textual Genres
The Nature of Statistical Learning Theory
  • V. Vapnik
  • Computer Science, Mathematics
    Statistics for Engineering and Information Science
  • 2000
Setting of the learning problem; consistency of learning processes; bounds on the rate of convergence of learning processes; controlling the generalization ability of learning processes; constructing