Corpus ID: 64328899

Twitter sentiment for 15 European languages

Authors: I. Mozetic, Miha Grcar, Jasmina Smailovic
The dataset contains over 1.6 million tweets (tweet IDs), labeled with sentiment by human annotators. There are 15 Twitter corpora, one for each of 15 European languages. The data can be used to train and evaluate Twitter sentiment classifiers, to compute annotator agreement, or to study the differences between language usage on Twitter. The data analysis is described in the following papers: I. Mozetic, M. Grcar, J. Smailovic. Multilingual Twitter sentiment classification: The role…
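As an illustration of the annotator-agreement use case, here is a minimal sketch that computes Cohen's kappa for two annotators who labeled the same tweets. Cohen's kappa is one common agreement measure and is not necessarily the one the authors use. The function name `cohen_kappa`, the label names, and the toy label lists are assumptions for illustration only; they are not taken from the dataset.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two equally long sequences of categorical labels."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and equally long")
    n = len(labels_a)

    # Observed agreement: fraction of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: product of each annotator's label frequencies,
    # summed over all labels.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)

    if expected == 1.0:
        # Both annotators used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Toy labels (hypothetical, not from the dataset): two annotators, six tweets.
first = ["Negative", "Neutral", "Positive", "Positive", "Neutral", "Negative"]
second = ["Negative", "Neutral", "Positive", "Neutral", "Neutral", "Positive"]

kappa = cohen_kappa(first, second)
print(f"Cohen's kappa: {kappa:.3f}")
```

Kappa treats the sentiment labels as unordered categories, so a Negative/Positive disagreement counts the same as a Negative/Neutral one. An ordinal-aware measure may be a better fit when the ordering of the labels matters.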

Cross-lingual sentiment transfer with limited resources
Over a Decade of Social Opinion Mining
A Global Analysis of Emoji Usage
Sentiment-based Candidate Selection for NMT