Corpus ID: 204910974

Error Analysis in a Hate Speech Detection Task: The Case of HaSpeeDe-TW at EVALITA 2018

@inproceedings{Francesconi2019ErrorAI,
  title={Error Analysis in a Hate Speech Detection Task: The Case of HaSpeeDe-TW at EVALITA 2018},
  author={Chiara Francesconi and Cristina Bosco and Fabio Poletto and Manuela Sanguinetti},
  booktitle={CLiC-it},
  year={2019}
}
Taking as a case study the Hate Speech Detection task at EVALITA 2018, the paper discusses the distribution and typology of the errors made by the five best-scoring systems. The focus is on the subtask where Twitter data was used both for training and testing (HaSpeeDe-TW). In order to highlight the complexity of hate speech and the reasons behind the failures in its automatic detection, the annotation provided for the task is enriched with orthogonal categories annotated in the original…

HaSpeeDe 2 @ EVALITA2020: Overview of the EVALITA 2020 Hate Speech Detection Task

The Hate Speech Detection (HaSpeeDe 2) task is the second edition of a shared task on the detection of hateful content in Italian Twitter messages. HaSpeeDe 2 is composed of a Main task (hate speech

Resources and benchmark corpora for hate speech detection: a systematic review

This review systematically analyzes the resources made available by the community at large, including their development methodology, topical focus, language coverage, and other factors, to highlight a heterogeneous, growing landscape.

TheNorth @ HaSpeeDe 2: BERT-based Language Model Fine-tuning for Italian Hate Speech Detection (short paper)

The systems that were submitted by the team “TheNorth” for the HaSpeeDe 2 shared task organised within EVALITA 2020 fine-tuned BERT-based models and evaluated both multilingual and Italian language models trained with the data provided and additional data.

CHILab @ HaSpeeDe 2: Enhancing Hate Speech Detection with Part-of-Speech Tagging (short paper)

Two neural network systems used for Hate Speech Detection tasks that make use not only of the pre-processed text but also of its Part-of-Speech (PoS) tag are described.

"Be nice to your wife! The restaurants are closed": Can Gender Stereotype Detection Improve Sexism Classification?

Results show that sexism classification can benefit from gender stereotype detection, and a new method for data augmentation based on sentence similarity with multilingual external datasets is proposed.

Overview of the CLEF 2022 JOKER Task 3: Pun Translation from English into French

This paper provides an overview of the CLEF 2022 JOKER track’s Pilot Task 3, where the goal is to translate entire phrases containing wordplay (particularly puns), and describes the data collection, the task setup, the evaluation procedure, and the participants’ results.

References

Overview of the EVALITA 2018 Hate Speech Detection Task

The Hate Speech Detection task is a shared task on Italian social media for the detection of hateful content, and it has been proposed for the first time at EVALITA 2018, providing two datasets from two different online social platforms differently featured from the linguistic and communicative point of view.

Comparing Different Supervised Approaches to Hate Speech Detection

This paper reports on the systems the InriaFBK Team submitted to the EVALITA 2018-Shared Task on Hate Speech Detection in Italian Twitter and Facebook posts (HaSpeeDe), based on three separate models: a model using a recurrent layer, an ngram-based neural network and a LinearSVC.

RuG @ EVALITA 2018: Hate Speech Detection In Italian Social Media

The systems the RuG Team developed in the context of the Hate Speech Detection Task in Italian Social Media at EVALITA 2018 are described, and the best macro-F1 score in all subtasks was obtained by a Linear SVM, using hate-rich embeddings.

SemEval-2019 Task 5: Multilingual Detection of Hate Speech Against Immigrants and Women in Twitter

The paper describes the organization of the SemEval 2019 Task 5 about the detection of hate speech against immigrants and women in Spanish and English messages extracted from Twitter, and provides an analysis and discussion about the participant systems and the results they achieved in both subtasks.

Overview of the Task on Automatic Misogyny Identification at IberEval 2018

The datasets, the evaluation methodology, an overview of the proposed systems and the obtained results are presented, some conclusions are drawn and the future work is discussed.

A Survey on Automatic Detection of Hate Speech in Text

This survey organizes and describes the current state of the field, providing a structured overview of previous approaches, including core algorithms, methods, and main features used, and provides a unifying definition of hate speech.

A Dataset of Hindi-English Code-Mixed Social Media Text for Hate Speech Detection

This work presents a Hindi-English code-mixed dataset consisting of tweets posted online on Twitter and proposes a supervised classification system for detecting hate speech in the text using various character level, word level, and lexicon based features.

A Survey on Hate Speech Detection using Natural Language Processing

A survey on hate speech detection describes key areas that have been explored to automatically recognize these types of utterances using natural language processing and discusses limits of those approaches.

SemEval-2018 Task 1: Affect in Tweets

This work presents the SemEval-2018 Task 1: Affect in Tweets, which includes an array of subtasks on inferring the affectual state of a person from their tweet, with a focus on the techniques and resources that are particularly useful.

spMMMP at GermEval 2018 shared task : classification of offensive content in tweets using convolutional neural networks and gated recurrent units

Two different systems for classifying offensive language in micro-blog messages from Twitter using an ensemble of convolutional neural networks and a combination of a CNN and a gated recurrent unit together with a transfer-learning approach based on pretraining with a large, automatically translated dataset are proposed.