FinnSentiment - A Finnish Social Media Corpus for Sentiment Polarity Annotation
@article{Lindn2020FinnSentimentA, title={FinnSentiment - A Finnish Social Media Corpus for Sentiment Polarity Annotation}, author={Krister Lind{\'e}n and T. Jauhiainen and Sam Hardwick}, journal={ArXiv}, year={2020}, volume={abs/2012.02613} }
Sentiment analysis and opinion mining is an important task with obvious application areas in social media, e.g. when indicating hate speech and fake news. In our survey of previous work, we note that there is no large-scale social media data set with sentiment polarity annotations for Finnish. This publication aims to remedy this shortcoming by introducing a 27,000-sentence data set annotated independently with sentiment polarity by three native annotators. We had the same three annotators for…
4 Citations
Evaluating morphological typology in zero-shot cross-lingual transfer
- Linguistics
- ACL
- 2021
This paper addresses what effects morphological typology has on zero-shot cross-lingual transfer for two tasks, part-of-speech tagging and sentiment analysis, and finds that transfer to another morphological type generally implies a higher loss than transfer to another language with the same morphological typology.
The Current State of Finnish NLP
- Computer Science
- IWCLUL
- 2021
This paper surveys recent papers focusing on Finnish NLP related to many different subcategories of NLP such as parsing, generation, semantics and speech.
Sentiment analysis of depression related discussions in the Suomi24 discussion forum
- Geology
- Informaatiotutkimus
- 2022
IWCLUL 2021 The Seventh International Workshop on Computational Linguistics of Uralic Languages
- Linguistics
- 2021
In the first decade of the 21st century, an atlas of Udmurt dialects was prepared for publication. Although hundreds of maps and legends were completed, due to no hope for publication, the project…
References
RuSentiment: An Enriched Sentiment Analysis Dataset for Social Media in Russian
- Computer Science
- COLING
- 2018
RuSentiment, a new dataset for sentiment analysis of social media posts in Russian, and a new set of comprehensive annotation guidelines that are extensible to other languages are presented.
Annotating evaluative sentences for sentiment analysis: a dataset for Norwegian
- Computer Science
- NODALIDA
- 2019
This paper documents the creation of a large-scale dataset of evaluative sentences – i.e. both subjective and objective sentences that are found to be sentiment-bearing – based on mixed-domain…
The Challenges of Multi-dimensional Sentiment Analysis Across Languages
- Computer Science
- PEOPLES@COLING
- 2016
This paper outlines a pilot study on multi-dimensional and multilingual sentiment analysis of social media content. We use parallel corpora of movie subtitles as a proxy for colloquial language in…
An Annotated Corpus for Sentiment Analysis in Political News
- Computer Science
- STIL
- 2015
A corpus of news texts in Brazilian Portuguese, segmented in paragraphs, and marked up by a group of four annotators, which built a gold standard, where paragraphs are classified according to the opinion of the majority of annotators.
Gold-standard for Topic-specific Sentiment Analysis of Economic Texts
- Computer Science
- LREC
- 2014
The annotations of 297 documents and over 9000 sentences can be used for research purposes when developing methods for detecting topic-wise sentiment in financial text and are evaluated using a number of inter-annotator agreement metrics.
Liars and Saviors in a Sentiment Annotated Corpus of Comments to Political Debates
- Computer Science
- ACL
- 2011
It is concluded that the challenge in performing opinion mining on this type of content is correctly identifying the positive opinions, because they are much less frequent than negative opinions and they are particularly exposed to verbal irony.
Datasets for Aspect-Based Sentiment Analysis in French
- Computer Science
- LREC
- 2016
Two datasets for the development and testing of ABSA systems for French which comprise user reviews annotated with relevant entities, aspects and polarity values are described.
An annotated corpus for Turkish sentiment analysis at sentence level
- Computer Science
- 2017 International Artificial Intelligence and Data Processing Symposium (IDAP)
- 2017
A Turkish sentiment corpus, which is comprised of user reviews and is annotated semi-automatically, is constructed and this dataset is made easy to use for Java applications by creating JSON data.
Sentiment and Behaviour Annotation in a Corpus of Dialogue Summaries
- Computer Science
- J. Univers. Comput. Sci.
- 2015
It is shown how the task can be made tractable by focusing on one of the many aspects of sentiment: sentiment as it is recorded in behaviour reports of people and their interactions.
Developing a successful SemEval task in sentiment analysis of Twitter and other social media texts
- Computer Science
- Lang. Resour. Evaluation
- 2016
We present the development and evaluation of a semantic analysis task that lies at the intersection of two very trendy lines of research in contemporary computational linguistics: (1) sentiment…