Predicting the Type and Target of Offensive Posts in Social Media

@inproceedings{Zampieri2019PredictingTT,
  title={Predicting the Type and Target of Offensive Posts in Social Media},
  author={Marcos Zampieri and Shervin Malmasi and Preslav Nakov and Sara Rosenthal and Noura Farra and Ritesh Kumar},
  booktitle={NAACL},
  year={2019}
}
As offensive content has become pervasive in social media, there has been much research on identifying potentially offensive messages. However, previous work on this topic did not consider the problem as a whole, but rather focused on detecting very specific types of offensive content, e.g., hate speech, cyberbullying, or cyber-aggression. In contrast, here we target several different kinds of offensive content. In particular, we model the task hierarchically, identifying the type and the target…
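The hierarchical framing in the abstract can be illustrated with a minimal sketch. It assumes three nested decisions (offensive or not, targeted or not, kind of target). The function name `label_path`, the label strings, and the `TARGET_KINDS` values are hypothetical names chosen for illustration; they are not taken from the truncated abstract above and are not the authors' implementation.

```python
from typing import Optional, Tuple

# Hypothetical target categories, used only for illustration.
TARGET_KINDS = ("individual", "group", "other")


def label_path(
    offensive: bool,
    targeted: Optional[bool] = None,
    target: Optional[str] = None,
) -> Tuple[str, ...]:
    """Return the chain of labels for one post, enforcing the hierarchy.

    A lower-level label is only allowed when the level above it
    makes that label meaningful.
    """
    if not offensive:
        # Non-offensive posts stop at the first level.
        if targeted is not None or target is not None:
            raise ValueError("non-offensive posts carry no type or target label")
        return ("not_offensive",)
    if targeted is None:
        raise ValueError("offensive posts need a targeted/untargeted decision")
    if not targeted:
        # Untargeted offensive posts stop at the second level.
        if target is not None:
            raise ValueError("untargeted posts carry no target label")
        return ("offensive", "untargeted")
    if target not in TARGET_KINDS:
        raise ValueError(f"unknown target kind: {target!r}")
    return ("offensive", "targeted", target)


if __name__ == "__main__":
    print(label_path(False))
    print(label_path(True, False))
    print(label_path(True, True, "group"))
```

Representing each annotation as a path makes it impossible to record a target for a post that was never judged offensive, which is the main constraint a hierarchical scheme of this kind imposes.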
An Empirical Study of Offensive Language in Online Interactions
In the past decade, usage of social media platforms has increased significantly. People use these platforms to connect with friends and family, share information, news and opinions. Platforms such as…
Auto-Off ID: Automatic Detection of Offensive Language in Social Media
As the popularity of social media grows, computer-mediated anonymity allows users to engage in activities that they would not do in real life. This makes users vulnerable to abuse through Internet…
Detecting Abusive Albanian
TLDR
An annotated Albanian dataset for hate speech and offensive speech that has been constructed from user-generated content on various social media platforms and follows the hierarchical schema introduced in Zampieri et al. (2019b).
SemEval-2019 Task 6: Identifying and Categorizing Offensive Language in Social Media (OffensEval)
TLDR
The results and the main findings of SemEval-2019 Task 6 on Identifying and Categorizing Offensive Language in Social Media (OffensEval), based on a new dataset containing over 14,000 English tweets, are presented.
Offensive Language and Hate Speech Detection for Danish
TLDR
This work constructs a Danish dataset, DKhate, containing user-generated comments from various social media platforms and, to the authors' knowledge, the first of its kind, annotated for various types and targets of offensive language, and develops four automatic classification systems designed to work for both the English and the Danish language.
DLRG@HASOC 2020: A Hybrid Approach for Hate and Offensive Content Identification in Multilingual Tweets
TLDR
In the proposed approach, a multi-class imbalance-based feature selection method is combined with an SVM classifier to classify each tweet as hate speech or not; it achieved an accuracy of 80% and 72% on the released German and Hindi tweets, respectively.
Offensive Language Detection in Nepali Social Media
Social media texts such as blog posts, comments, and tweets often contain offensive language, including racial hate speech, personal attacks, and sexual harassment. Detecting inappropriate…
SOLID: A Large-Scale Semi-Supervised Dataset for Offensive Language Identification
TLDR
This work creates the largest available dataset for this task, SOLID, which contains over nine million English tweets labeled in a semi-supervised manner, and demonstrates experimentally that using SOLID along with OLID yields improved performance on the OLID test set for two different models, especially for the lower levels of the taxonomy.
Pardeep at SemEval-2019 Task 6: Identifying and Categorizing Offensive Language in Social Media using Deep Learning
TLDR
The proposed approach addresses the three sub-tasks of SemEval-2019 Task 6, which cover the identification of offensive tweets as well as their categorization, suggesting that the proposed models can be used to automate the detection of offensive posts in social media.
Automatic offensive language detection from Twitter data using machine learning and feature selection of metadata
The popularity of social networks has only increased in recent years. In theory, the use of social media was proposed so we could share our views online, keep in contact with loved ones or share good…

References

SHOWING 1-10 OF 19 REFERENCES
SemEval-2019 Task 6: Identifying and Categorizing Offensive Language in Social Media (OffensEval)
TLDR
The results and the main findings of SemEval-2019 Task 6 on Identifying and Categorizing Offensive Language in Social Media (OffensEval), based on a new dataset containing over 14,000 English tweets, are presented.
Automated Hate Speech Detection and the Problem of Offensive Language
TLDR
This work used a crowd-sourced hate speech lexicon to collect tweets containing hate speech keywords and labeled a sample of these tweets into three categories: those containing hate speech, those containing only offensive language, and those with neither.
Benchmarking Aggression Identification in Social Media
TLDR
The Shared Task on Aggression Identification, organised as part of the First Workshop on Trolling, Aggression and Cyberbullying (TRAC-1) at COLING 2018, was to develop a classifier that could discriminate between Overtly Aggressive, Covertly Aggressive, and Non-aggressive texts.
Learning from Bullying Traces in Social Media
TLDR
Evidence is presented that social media, with appropriate natural language processing techniques, can be a valuable and abundant data source for the study of bullying in both worlds.
Cyber Hate Speech on Twitter: An Application of Machine Classification and Statistical Modeling for Policy and Decision Making
TLDR
It is demonstrated how the results of the classifier can be robustly utilized in a statistical model used to forecast the likely spread of cyber hate in a sample of Twitter data.
Detecting Hate Speech in Social Media
TLDR
This paper aims to establish lexical baselines for this task by applying supervised classification methods using a recently released dataset annotated for this purpose, and obtains results of 78% accuracy in identifying posts across three classes.
Abusive Language Detection on Arabic Social Media
TLDR
A list of obscene words and hashtags is extracted using common patterns used in offensive and rude communications and Twitter users are classified according to whether they use any of these words or not in their tweets.
Locate the Hate: Detecting Tweets against Blacks
TLDR
A supervised machine learning approach is applied, employing inexpensively acquired labeled data from diverse Twitter accounts to learn a binary classifier for the labels "racist" and "nonracist", suggesting that with further improvements, this work can contribute data on the sources of anti-black hate speech.
Abusive Language Detection in Online User Content
TLDR
A machine learning based method to detect hate speech on online user comments from two domains which outperforms a state-of-the-art deep learning approach and a corpus of user comments annotated for abusive language, the first of its kind.
Challenges in discriminating profanity from hate speech
TLDR
Analysis of the results reveals that discriminating hate speech and profanity is not a simple task, which may require features that capture a deeper understanding of the text not always possible with surface n-grams.