• Publications
KanCMD: Kannada CodeMixed Dataset for Sentiment Analysis and Offensive Language Detection
TLDR
The KanCMD dataset contains real code-mixed comments posted by users on YouTube, rather than monolingual textbook text, and is annotated for two tasks, sentiment analysis and offensive language detection, in the under-resourced Kannada language.
Hope Speech detection in under-resourced Kannada language
TLDR
An English-Kannada hope speech dataset, KanHope, is proposed, and DC-BERT4HOPE, a dual-channel model, is introduced, outperforming other models.
UVCE-IIITT@DravidianLangTech-EACL2021: Tamil Troll Meme Classification: You need to Pay more Attention
TLDR
This work presents a model with a transformer-transformer architecture that aims for state-of-the-art performance in classifying Tamil memes as troll or non-troll, using attention as its main component.
Attentive fine-tuning of Transformers for Translation of low-resourced languages @LoResMT 2021
TLDR
This paper reports the machine translation systems submitted by the IIITT team for the English→Marathi and English⇔Irish language pairs of the LoResMT 2021 shared task. The team fine-tunes IndicTrans, a pretrained multilingual NMT model, for English→Marathi, using an external parallel corpus as input for additional training.
IIITT@LT-EDI-EACL2021-Hope Speech Detection: There is always hope in Transformers
TLDR
This paper describes the work for the Shared Task on Hope Speech Detection for Equality, Diversity, and Inclusion at LT-EDI 2021 (EACL 2021), using several transformer-based models to classify social media comments as hope speech or not hope speech in English, Malayalam, and Tamil.
IIITT at CASE 2021 Task 1: Leveraging Pretrained Language Models for Multilingual Protest Detection
TLDR
This paper presents work on the sentence classification subtask of multilingual protest detection in CASE@ACL-IJCNLP 2021, employing various multilingual pre-trained transformer models to classify whether a sentence contains information about an event that has transpired.
IIITT@DravidianLangTech-EACL2021: Transfer Learning for Offensive Language Detection in Dravidian Languages
TLDR
This paper applies various transfer learning-based models to classify a given post or comment in Dravidian languages (Malayalam, Tamil, and Kannada) into 6 categories, identifying offensive language in multilingual posts that are largely code-mixed or written in a non-native script.
Findings of the Shared Task on Emotion Analysis in Tamil
TLDR
The paper presents the dataset used in the shared task, the task description, the methodologies used by the participants, and the evaluation results of the submissions.
Benchmarking Multi-Task Learning for Sentiment Analysis and Offensive Language Identification in Under-Resourced Dravidian Languages
TLDR
Analysis of fine-tuned models indicates that multi-task learning is preferable to single-task learning, yielding a higher weighted F1-score on all three languages: Kannada, Malayalam, and Tamil.
Offensive Language Identification in Low-resourced Code-mixed Dravidian languages using Pseudo-labeling
TLDR
This work classifies code-mixed social media comments and posts in the Dravidian languages of Tamil, Kannada, and Malayalam, and aims to improve offensive language identification by generating pseudo-labels on the dataset.
...