CRUSH: Contextually Regularized and User anchored Self-supervised Hate speech Detection

Parag Dutta, Souvic Chakraborty, Sumegh Roychowdhury, Animesh Mukherjee
The last decade has witnessed a surge in the interaction of people through social networking platforms. While there are several positive aspects of these social platforms, their proliferation has led them to become the breeding ground for cyber-bullying and hate speech. Recent advances in NLP have often been used to mitigate the spread of such hateful content. Since the task of hate speech detection is usually applicable in the context of social networks, we introduce CRUSH, a framework for…


Fast Few shot Self-attentive Semi-supervised Political Inclination Prediction

A self-attentive semi-supervised framework for political inclination detection is introduced that is highly efficient even in resource-constrained settings, and insights drawn from its predictions match the manual survey outcomes when applied to diverse real-life scenarios.

Decoding Demographic un-fairness from Indian Names

The bias in the existing Indian system is assessed through case studies, and an attempt is made to identify some intriguing patterns manifesting in the complex demographic layout of the sub-continent across the dimensions of gender and caste.



Hate Me, Hate Me Not: Hate Speech Detection on Facebook

This work proposes a variety of hate categories and designs and implements two classifiers for the Italian language, based on different learning algorithms: the first based on Support Vector Machines (SVM) and the second on a particular Recurrent Neural Network named Long Short Term Memory (LSTM).

Hate begets Hate: A Temporal Study of Hate Speech.

The first temporal analysis of hate speech on Gab, a social media site with a very loose moderation policy, generates temporal snapshots of Gab from millions of posts and users and calculates an activity vector based on the DeGroot model to identify hateful users.

Automatic Detection of Hate Speech on Facebook Using Sentiment and Emotion Analysis

A novel framework to effectively detect highly discussed topics that generate hate speech on Facebook is explored with the use of graph, sentiment, and emotion analysis techniques and is able to identify the pages that promote hate speech in the comment sections regarding sensitive topics automatically.

ToxiGen: A Large-Scale Machine-Generated Dataset for Adversarial and Implicit Hate Speech Detection

ToxiGen, a new large-scale, machine-generated dataset of 274k toxic and benign statements about 13 minority groups, is created, and it is demonstrated that finetuning a toxicity classifier on this data substantially improves its performance on human-written data.

Fight Fire with Fire: Fine-tuning Hate Detectors using Large Samples of Generated Hate Speech

This work uses the GPT LM to generate large amounts of synthetic hate speech sequences from available labeled examples, and shows that leveraging the generated data in fine-tuning large pretrained LMs on hate detection improves generalization significantly and consistently within and across data distributions.

Ruddit: Norms of Offensiveness for English Reddit Comments

This work creates the first dataset of English language Reddit comments that has fine-grained, real-valued scores between -1 (maximally supportive) and 1 (maximally offensive) and shows that the method produces highly reliable offensiveness scores.

Introducing CAD: the Contextual Abuse Dataset

A new dataset of primarily English Reddit entries is introduced; it addresses several limitations of prior work, contains six conceptually distinct primary categories as well as secondary categories, and uses an expert-driven group-adjudication process for high-quality annotations.

Abuse is Contextual, What about NLP? The Role of Context in Abusive Language Annotation and Detection

This paper re-annotates part of a widely used dataset for abusive language detection in English in two conditions, i.e. with and without context, and argues that a context-aware classification is more challenging but also more similar to a real application scenario.

HateXplain: A Benchmark Dataset for Explainable Hate Speech Detection

HateXplain, the first benchmark hate speech dataset covering multiple aspects of the issue, is introduced; experiments with existing state-of-the-art models show that models which utilize the human rationales for training perform better in reducing unintended bias towards target communities.

HateBERT: Retraining BERT for Abusive Language Detection in English

HateBERT, a re-trained BERT model for abusive language detection in English, is introduced, and a battery of experiments comparing the portability of the fine-tuned models across the datasets is discussed, suggesting that portability is affected by the compatibility of the annotated phenomena.