• Corpus ID: 233364926

IIITT@DravidianLangTech-EACL2021: Transfer Learning for Offensive Language Detection in Dravidian Languages

@inproceedings{Yasaswini2021IIITTDravidianLangTechEACL2021TL,
  title={IIITT@DravidianLangTech-EACL2021: Transfer Learning for Offensive Language Detection in Dravidian Languages},
  author={Konthala Yasaswini and Karthik Puranik and Adeep Hande and Ruba Priyadharshini and Sajeetha Thavareesan and Bharathi Raja Chakravarthi},
  booktitle={DRAVIDIANLANGTECH},
  year={2021}
}
This paper presents our work for the shared task on Offensive Language Identification in Dravidian Languages at EACL 2021. Offensive language detection on social media platforms has been studied previously. However, with the increasing diversity of users, there is a need to identify offensive language in multilingual posts that are largely code-mixed or written in a non-native script. We approach this challenge with various transfer learning-based models to classify a given post or… 

IIITK@DravidianLangTech-EACL2021: Offensive Language Identification and Meme Classification in Tamil, Malayalam and Kannada
This paper describes the IIITK team’s submissions to the offensive language identification and troll meme classification shared tasks for Dravidian languages at the DravidianLangTech 2021 workshop@EACL
Developing Successful Shared Tasks on Offensive Language Identification for Dravidian Languages
TLDR
An evaluation task is presented at FIRE 2020 HASOC-DravidianCodeMix and DravidianLangTech at EACL 2021, designed to provide a framework for comparing different approaches to this problem.
OffTamil@DravideanLangTech-EASL2021: Offensive Language Identification in Tamil Text
TLDR
This study focuses on offensive language identification in Tamil, a code-mixed, low-resource Dravidian language, using four classifiers with the chi-squared (χ²) feature selection technique together with BoW and TF-IDF feature representations over different combinations of n-grams.
UVCE-IIITT@DravidianLangTech-EACL2021: Tamil Troll Meme Classification: You need to Pay more Attention
TLDR
This work presents a model with a transformer-transformer architecture that aims for state-of-the-art classification of troll and non-troll Tamil memes by using attention as its main component.
Findings of the Shared Task on Offensive Language Identification in Tamil, Malayalam, and Kannada
TLDR
A shared task on offensive language detection in Dravidian languages is created and an overview of the methods and the results of the competing systems are presented.
DLRG@DravidianLangTech-EACL2021: Transformer based approach for Offensive Language Identification on Code-Mixed Tamil
TLDR
A transformer-based language model is applied to analyse sentiment in Tanglish tweets, which combine Tamil and English, and a bidirectional transformer model is shown to achieve an F1 score of 64% in detecting hate speech in code-mixed Tamil-English tweets.
Pegasus@Dravidian-CodeMix-HASOC2021: Analyzing Social Media Content for Detection of Offensive Text
TLDR
This research paper employs two Transformer-based prototypes, which ranked in the top 8 for all tasks of the HASOC-DravidianCodeMix FIRE 2021 shared task, and introduces two methods for detecting offensive comments and posts in Tamil and Malayalam.
JudithJeyafreedaAndrew@DravidianLangTech-EACL2021: Offensive language detection for Dravidian Code-mixed YouTube comments
  • J. Andrew
  • Computer Science
    DRAVIDIANLANGTECH
  • 2021
TLDR
The task of offensive language detection on YouTube comments in the Dravidian languages Tamil, Malayalam and Kannada is treated as a multiclass classification problem after language pre-processing.
BPHC@DravidianLangTech-ACL2022-A comparative analysis of classical and pre-trained models for troll meme classification in Tamil
TLDR
This paper classifies troll meme captions in Tamil-English code-mixed form, achieving a weighted F1 score of 0.74 with the MuRIL pretrained model, and compares the performance of 11 different classification algorithms using accuracy and F1-score.
IIITK@LT-EDI-EACL2021: Hope Speech Detection for Equality, Diversity, and Inclusion in Tamil, Malayalam and English
This paper describes the IIITK team’s submissions to the shared task on hope speech detection for equality, diversity and inclusion in Dravidian languages, organized by the LT-EDI 2021 workshop@EACL 2021.

References

Findings of the Shared Task on Offensive Language Identification in Tamil, Malayalam, and Kannada
TLDR
A shared task on offensive language detection in Dravidian languages is created and an overview of the methods and the results of the competing systems are presented.
CUSATNLP@HASOC-Dravidian-CodeMix-FIRE2020: Identifying Offensive Language from ManglishTweets
TLDR
In the authors' approach, an embedding model-based classifier identifies offensive and not-offensive comments; it was applied to the Manglish dataset provided with the HASOC Offensive Language Identification-DravidianCodeMix sub-track.
ALT Submission for OSACT Shared Task on Offensive Language Detection
TLDR
For offensive language detection, a system combination of Support Vector Machines and Deep Neural Networks achieved the best results on the development set and ranked 1st in the official results for Subtask A, with an F1-score of 90.51% on the test set.
IIITT@LT-EDI-EACL2021-Hope Speech Detection: There is always hope in Transformers
TLDR
This paper presents the work for the Shared Task on Hope Speech Detection for Equality, Diversity, and Inclusion at LT-EDI 2021-EACL 2021, using several transformer-based models to classify social media comments as hope speech or not hope speech in English, Malayalam, and Tamil.
Detecting Offensive Tweets in Hindi-English Code-Switched Language
TLDR
A novel tweet dataset, titled the Hindi-English Offensive Tweet (HEOT) dataset, is introduced; it consists of tweets in Hindi-English code-switched language split into three classes: non-offensive, abusive and hate-speech.
Overview of the HASOC Track at FIRE 2020: Hate Speech and Offensive Language Identification in Tamil, Malayalam, Hindi, English and German
TLDR
This paper presents the HASOC track and its two parts, creating test collections for languages with few resources and English for comparison, and presents the tasks, the data and the main results.
NLP_Passau at SemEval-2020 Task 12: Multilingual Neural Network for Offensive Language Detection in English, Danish and Turkish
TLDR
This paper describes a neural network model that was used for participating in the OffensEval, Task 12 of the SemEval 2020 workshop, and achieved F1 scores of 90.88%, 76.76% and 76.70%, respectively.
Fighting Offensive Language on Social Media with Unsupervised Text Style Transfer
TLDR
This paper proposes a new approach to tackling offensive language in online social media, using unsupervised text style transfer to translate offensive sentences into non-offensive ones, together with a new method for training encoder-decoders on non-parallel data that combines a collaborative classifier, attention and the cycle consistency loss.
UHH-LT at SemEval-2019 Task 6: Supervised vs. Unsupervised Transfer Learning for Offensive Language Detection
TLDR
Results indicate that unsupervised transfer from large datasets performs slightly better than supervised training on small ‘near target category’ datasets.
Automated Hate Speech Detection and the Problem of Offensive Language
TLDR
This work uses a crowd-sourced hate speech lexicon to collect tweets containing hate speech keywords and labels a sample of these tweets into three categories: those containing hate speech, those containing only offensive language, and those with neither.