SemEval-2020 Task 12: Multilingual Offensive Language Identification in Social Media (OffensEval 2020)

@inproceedings{Zampieri2020SemEval2020T1,
  title={SemEval-2020 Task 12: Multilingual Offensive Language Identification in Social Media (OffensEval 2020)},
  author={Marcos Zampieri and Preslav Nakov and Sara Rosenthal and Pepa Atanasova and Georgi Karadzhov and Hamdy Mubarak and Leon Derczynski and Zeses Pitenis and Çağrı Çöltekin},
  booktitle={Proceedings of the Fourteenth Workshop on Semantic Evaluation (SemEval-2020)},
  year={2020}
}
We present the results and the main findings of SemEval-2020 Task 12 on Multilingual Offensive Language Identification in Social Media (OffensEval-2020). The task included three subtasks corresponding to the hierarchical taxonomy of the OLID schema from OffensEval-2019, and it was offered in five languages: Arabic, Danish, English, Greek, and Turkish. OffensEval-2020 was one of the most popular tasks at SemEval-2020, attracting a large number of participants across all subtasks and languages: a… 
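The subtasks follow the hierarchical OLID annotation scheme from OffensEval-2019. A minimal sketch of that label hierarchy in Python (the label names come from the OLID scheme; the dictionary structure and helper function are illustrative):

```python
# Hierarchical OLID taxonomy behind OffensEval subtasks A, B, and C.
# Level A: is the tweet offensive?  Level B: is the offense targeted?
# Level C: who or what is the target?
OLID_TAXONOMY = {
    "A": {  # Offensive language identification
        "NOT": "Not offensive",
        "OFF": "Offensive",
    },
    "B": {  # Categorization of offense type (annotated only for OFF tweets)
        "UNT": "Untargeted (general profanity)",
        "TIN": "Targeted insult or threat",
    },
    "C": {  # Offense target identification (annotated only for TIN tweets)
        "IND": "Individual",
        "GRP": "Group",
        "OTH": "Other",
    },
}

def valid_labels(level: str) -> set:
    """Return the admissible labels for a given subtask level."""
    return set(OLID_TAXONOMY[level])
```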
Duluth at SemEval-2020 Task 12: Offensive Tweet Identification in English with Logistic Regression
TLDR
It is hypothesized that the extremely high accuracy (>90%) of the top-ranked systems may reflect methods that learn the training data very well but may not generalize to the task of identifying offensive language in English.
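A rough illustration of this kind of linear baseline, a bag-of-words logistic regression classifier (the toy data, features, and hyperparameters below are invented for the sketch, not taken from the Duluth system description):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Tiny invented training set; a real system would train on the OLID/OffensEval data.
tweets = [
    "you are a complete idiot",
    "shut up you moron",
    "have a lovely day everyone",
    "great game last night",
]
labels = ["OFF", "OFF", "NOT", "NOT"]

# TF-IDF word and bigram features feeding a logistic regression classifier.
model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
model.fit(tweets, labels)

# Predict a label for an unseen tweet.
pred = model.predict(["what an idiot"])[0]
```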
LIIR at SemEval-2020 Task 12: A Cross-Lingual Augmentation Approach for Multilingual Offensive Language Identification
TLDR
This paper presents a system entitled ‘LIIR’ for SemEval-2020 Task 12 on Multilingual Offensive Language Identification in Social Media (OffensEval 2020); it proposes a cross-lingual augmentation approach to enrich the training data and uses Multilingual BERT to obtain sentence representations.
KEIS@JUST at SemEval-2020 Task 12: Identifying Multilingual Offensive Tweets Using Weighted Ensemble and Fine-Tuned BERT
TLDR
This research presents team KEIS@JUST's participation in SemEval-2020 Task 12, a shared task on multilingual offensive language, using transfer learning from BERT alongside recurrent neural networks such as Bi-LSTM and Bi-GRU, followed by a global average pooling layer.
GUIR at SemEval-2020 Task 12: Domain-Tuned Contextualized Models for Offensive Language Detection
TLDR
An ablation study reveals that domain tuning considerably improves the classification performance of the BERT model, and error analysis shows common misclassification errors made by the model and outlines research directions for future work.
WOLI at SemEval-2020 Task 12: Arabic Offensive Language Identification on Different Twitter Datasets
TLDR
The system submitted by WideBot AI Lab for the shared task ranked 10th out of 52 participants with a Macro-F1 of 86.9% on the golden dataset under the CodaLab username “yasserotiefy”; a neural network approach was introduced that enhanced the predictive ability of the system, including CNN, highway network, Bi-LSTM, and attention layers.
problemConquero at SemEval-2020 Task 12: Transformer and Soft Label-based Approaches
TLDR
Various systems submitted by the team problemConquero for SemEval-2020 Shared Task 12 “Multilingual Offensive Language Identification in Social Media”, including transformer-based approaches and a soft label-based approach are presented.
NLPDove at SemEval-2020 Task 12: Improving Offensive Language Detection with Cross-lingual Transfer
TLDR
A new metric, Translation Embedding Distance, is proposed to measure the transferability of instances for cross-lingual data selection, along with various preprocessing steps tailored for social media text and methods to fine-tune pre-trained multilingual BERT (mBERT) for offensive language identification.
Team Rouges at SemEval-2020 Task 12: Cross-lingual Inductive Transfer to Detect Offensive Language
TLDR
This work introduces a cross-lingual inductive approach to identify the offensive language in tweets using the contextual word embedding XLM-RoBERTa (XLM-R), and shows that this model works competitively in a zero-shot learning environment, and is extensible to other languages.
ANDES at SemEval-2020 Task 12: A Jointly-trained BERT Multilingual Model for Offensive Language Detection
TLDR
A single model was jointly-trained by fine-tuning Multilingual BERT to tackle the task across all the proposed languages, with a performance close to top-performing systems in spite of sharing the same parameters across all languages.
KUISAIL at SemEval-2020 Task 12: BERT-CNN for Offensive Speech Identification in Social Media
TLDR
It is shown that combining CNN with BERT is better than using BERT on its own, and the importance of utilizing pre-trained language models for downstream tasks is emphasized.
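The BERT-CNN idea, sketched with NumPy (random vectors stand in for BERT's contextual token embeddings; the filter count, widths, and dimensions are arbitrary choices for illustration, not the paper's configuration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for BERT output: 12 tokens, 768-dimensional contextual embeddings.
seq_len, hidden = 12, 768
token_embeddings = rng.standard_normal((seq_len, hidden))

# A 1D convolution over the token axis: 4 filters of width 3.
n_filters, width = 4, 3
filters = rng.standard_normal((n_filters, width, hidden))

# Slide each filter over token windows, then max-pool over positions,
# yielding one fixed-size feature vector per tweet for the classifier head.
conv = np.array([
    [np.sum(token_embeddings[i:i + width] * f) for i in range(seq_len - width + 1)]
    for f in filters
])
features = conv.max(axis=1)  # shape: (n_filters,)
```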

References

SHOWING 1-10 OF 127 REFERENCES
SemEval-2019 Task 6: Identifying and Categorizing Offensive Language in Social Media (OffensEval)
TLDR
The results and the main findings of SemEval-2019 Task 6 on Identifying and Categorizing Offensive Language in Social Media (OffensEval), based on a new dataset containing over 14,000 English tweets, are presented.
SemEval-2019 Task 5: Multilingual Detection of Hate Speech Against Immigrants and Women in Twitter
TLDR
The paper describes the organization of the SemEval 2019 Task 5 about the detection of hate speech against immigrants and women in Spanish and English messages extracted from Twitter, and provides an analysis and discussion about the participant systems and the results they achieved in both subtasks.