• Corpus ID: 4749650

Automatic Detection of Online Jihadist Hate Speech

Tom De Smedt, Guy De Pauw, Pieter Van Ostaeyen
We have developed a system that automatically detects online jihadist hate speech with over 80% accuracy, by using techniques from Natural Language Processing and Machine Learning. The system is trained on a corpus of 45,000 subversive Twitter messages collected from October 2014 to December 2016. We present a qualitative and quantitative analysis of the jihadist rhetoric in the corpus, examine the network of Twitter users, outline the technical procedure used to train the system, and discuss… 

Multilingual Cross-domain Perspectives on Online Hate Speech

In this report, we present a study of eight corpora of online hate speech and demonstrate the NLP techniques that we used to collect and analyze the jihadist, extremist, racist, and sexist content.

Automatic Classification and Linguistic Analysis of Extremist Online Material

A system is presented for recognizing, classifying, and inspecting extremist online material in terms of different characteristics, and for identifying its authors.

Detection of Abusive Speech for Mixed Sociolects of Russian and Ukrainian Languages

It is demonstrated that using an unsupervised probabilistic technique with a seed dictionary to detect abusive comments in Russian- and Ukrainian-language social media is feasible and can detect abusive terms that are not present in the seed dictionary.

A Survey on Multilingual Hate Speech Detection and Classification by Machine Learning Techniques

It is critical to identify hate speech on social media. The spread of uncontrolled hate can damage society, marginalized people, or groups. Social media plays a significant role in hate speech


This paper presents work on detecting Arabic-language hate speech on Twitter, one of the main Online Social Networks (OSN). It reports a classifier that shows consistently high performance and outperforms the other classifiers, and finds that TF-IDF outperforms BoW and consequently achieves the highest accuracy.

Quick and Simple Approach for Detecting Hate Speech in Arabic Tweets

A “quick and simple” approach to tackle the problem of detecting hate speech in Arabic tweets by investigating the effectiveness of 15 classical and neural learning models, while exploring two different term representations.

Hate Speech Detection Using Natural Language Processing: Applications and Challenges

Different types of hate speech, such as racism, sexism, and religious hate speech, and the various methods proposed to tackle them are discussed, along with the challenges identified and the solutions proposed.

A Deep Learning Approach for Automatic Hate Speech Detection in the Saudi Twittersphere

This paper investigates several neural network models based on convolutional neural networks (CNN) and recurrent neural networks (RNN) for detecting hate speech in Arabic tweets, and evaluates the recent language representation model Bidirectional Encoder Representations from Transformers (BERT) on the task of Arabic hate speech detection.

Measuring and Characterizing Hate Speech on News Websites

A large-scale quantitative analysis of 125M comments posted on 412K news articles over the course of 19 months finds statistically significant increases in hateful commenting activity around real-world divisive events like the “Unite the Right” rally in Charlottesville and political events like the second and third 2016 US presidential debates.

Jihadists on Social Media: A Critique of Data Collection Methodologies

A general model of data collection from social media, in the context of terrorism research, focusing on recent studies of jihadists is proposed, showing that the methods used are prone to sampling biases, and that the sampled datasets are not sufficiently filtered or validated to ensure reliability of conclusions derived from them.



Detecting Jihadist Messages on Twitter

This work makes a first attempt to automatically detect messages released by jihadist groups on Twitter, using a machine learning approach that classifies a tweet as either containing material that supports jihadist groups or not.

Detecting Hate Speech on the World Wide Web

The definition of hate speech, the collection and annotation of the hate speech corpus, and a mechanism for detecting some commonly used methods of evading common "dirty word" filters are described.

The YouTube Jihadists: A Social Network Analysis of Al-Muhajiroun’s Propaganda Campaign

Producers of Al-Qaeda inspired propaganda have shifted their operations in recent years from closed membership online forums to mainstream social networking platforms. Using social network analysis,

Examining ISIS Support and Opposition Networks on Twitter

A mixed-methods analytic approach is used to identify and characterize in detail both ISIS support and opposition networks on Twitter to better understand the networks of ISIS supporters and opponents.

Online extremism and the communities that sustain it: Detecting the ISIS supporting community on Twitter

Iterative Vertex Clustering and Classification (IVCC), a scalable analytic approach for online extremist community (OEC) detection in annotated heterogeneous networks, is presented, and an illustrative case study is provided of an online community of over 22,000 Twitter users whose online behavior directly advocates support for ISIS or contributes to the group’s propaganda dissemination through retweets.

From Keywords to Discursive Legitimation: Representing 'kuffar' in the Jihadist Propaganda Magazines

The aim of this paper is to show how IS and Al-Qaeda discursively construct “kuffar” in their propaganda magazines and legitimize this as a negative social identity, and to contribute to the current academic debate about one of the main analytic tools in Corpus Assisted Discourse Studies (CADS): keywords.

Understanding terror networks.

M. Sageman, Political Science, International journal of emergency mental health, 2005
The origins of the Jihad, the Mujahedin, and social networks and the Jihad are reviewed, with a list of names of terrorists and a glossary of foreign-language terms.

Terrorist Migration to the Dark Web

Some of the recent trends in terrorist use of the Dark Web for communication, fundraising, storing information and online material are reported.

"Vreselijk mooi!" (terribly beautiful): A Subjectivity Lexicon for Dutch Adjectives

A new open source subjectivity lexicon for Dutch adjectives is presented: a dictionary of 1,100 adjectives that occur frequently in online product reviews, manually annotated with polarity strength, subjectivity, and intensity for each word sense.

Hate in Cyberspace: Regulating Hate Speech on the Internet

The Internet is a global network providing connections for many forms of speech. All the processes of message transmission occur in real space through a system of identifiable algorithms. The