• Corpus ID: 227305851

FinnSentiment - A Finnish Social Media Corpus for Sentiment Polarity Annotation

Krister Lindén, T. Jauhiainen, Sam Hardwick
Sentiment analysis and opinion mining is an important task with obvious application areas in social media, e.g. when indicating hate speech and fake news. In our survey of previous work, we note that there is no large-scale social media data set with sentiment polarity annotations for Finnish. This publication aims to remedy this shortcoming by introducing a 27,000 sentence data set annotated independently with sentiment polarity by three native annotators. We had the same three annotators for…

Evaluating morphological typology in zero-shot cross-lingual transfer

This paper addresses what effects morphological typology has on zero-shot cross-lingual transfer for two tasks, part-of-speech tagging and sentiment analysis, and finds that transfer to another morphological type generally implies a higher loss than transfer to another language with the same morphological typology.

The Current State of Finnish NLP

This paper surveys recent papers focusing on Finnish NLP related to many different subcategories of NLP such as parsing, generation, semantics and speech.

IWCLUL 2021 The Seventh International Workshop on Computational Linguistics of Uralic Languages

  • Linguistics
  • 2021
In the first decade of the 21st century, an atlas of Udmurt dialects was prepared for publication. Although hundreds of maps and legends were completed, due to no hope for publication, the project



RuSentiment: An Enriched Sentiment Analysis Dataset for Social Media in Russian

RuSentiment, a new dataset for sentiment analysis of social media posts in Russian, and a new set of comprehensive annotation guidelines that are extensible to other languages are presented.

Annotating evaluative sentences for sentiment analysis: a dataset for Norwegian

This paper documents the creation of a large-scale dataset of evaluative sentences – i.e. both subjective and objective sentences that are found to be sentiment-bearing – based on mixed-domain

The Challenges of Multi-dimensional Sentiment Analysis Across Languages

This paper outlines a pilot study on multi-dimensional and multilingual sentiment analysis of social media content. We use parallel corpora of movie subtitles as a proxy for colloquial language in

An Annotated Corpus for Sentiment Analysis in Political News

A corpus of news texts in Brazilian Portuguese, segmented into paragraphs and marked up by a group of four annotators, who built a gold standard in which paragraphs are classified according to the opinion of the majority of annotators.

Gold-standard for Topic-specific Sentiment Analysis of Economic Texts

The annotations of 297 documents and over 9000 sentences can be used for research purposes when developing methods for detecting topic-wise sentiment in financial text and are evaluated using a number of inter-annotator agreement metrics.

Liars and Saviors in a Sentiment Annotated Corpus of Comments to Political Debates

It is concluded that the challenge in performing opinion mining in this type of content is correctly identifying the positive opinions, because they are much less frequent than negative opinions and they are particularly exposed to verbal irony.

Datasets for Aspect-Based Sentiment Analysis in French

Two datasets for the development and testing of ABSA systems for French which comprise user reviews annotated with relevant entities, aspects and polarity values are described.

An annotated corpus for Turkish sentiment analysis at sentence level

A Turkish sentiment corpus, which comprises user reviews and is annotated semi-automatically, is constructed, and this dataset is made easy to use for Java applications by creating JSON data.

Sentiment and Behaviour Annotation in a Corpus of Dialogue Summaries

It is shown how the task can be made tractable by focusing on one of the many aspects of sentiment: sentiment as it is recorded in behaviour reports of people and their interactions.

Developing a successful SemEval task in sentiment analysis of Twitter and other social media texts

We present the development and evaluation of a semantic analysis task that lies at the intersection of two very trendy lines of research in contemporary computational linguistics: (1) sentiment