• Corpus ID: 9071039

Antisocial Behavior corpus for harmful language detection

Myriam Munezero, Maxim Mozgovoy, Tuomo Kakkonen, Vitaly Klyuev, Erkki Sutinen
2013 Federated Conference on Computer Science and Information Systems
We report on experiments that demonstrate the relevance of our AntiSocial Behavior (ASB) corpus as a machine learning resource to detect antisocial behavior from text. We first describe the corpus and then, by using the corpus for training machine learning algorithms, we build a set of binary classifiers. Experimental evaluations revealed that classifiers built based on the ASB corpus produce reliable classification results with up to 98% accuracy. We believe that the dataset will be valuable… 

Automatic Detection of Antisocial Behaviour in Texts

This work uses an ASB text corpus collected as a machine learning resource and approaches the detection of ASB in text as a binary classification problem, where discriminating features are taken from the linguistic representation of the text in the form of bag-of-words and ontology-based emotion descriptors.

Anti social comment classification based on kNN algorithm

  • Nidhi Chandra, S. Khatri, S. Som
  • Computer Science
    2017 6th International Conference on Reliability, Infocom Technologies and Optimization (Trends and Future Directions) (ICRITO)
  • 2017
The aim of this paper is to identify comments and posts that are racist and malicious in nature so that they can be effectively banned and removed.

Approaches to Automated Detection of Cyberbullying: A Survey

This paper essentially maps out the state-of-the-art in cyberbullying detection research and serves as a resource for researchers to determine where to best direct their future research efforts in this field.

When the Timeline Meets the Pipeline: A Survey on Automated Cyberbullying Detection

The results show that BERT outperforms state-of-the-art cyberbullying detection models and deep learning models, and that deep learning models initialized with slang-based word embeddings outperform deep learning models initialized with traditional word embeddings.

A Framework for Cyberbullying Detection in Social Network

A framework is proposed for detecting negative online interactions, in the form of abusive content carried in text messages as well as images, and the combination of text and image analysis techniques is considered a suitable platform for detecting potential cyberbullying threats.

A lexical database for public textual cyberbullying detection

It is argued that one of the cornerstones in the overall process of mitigating the effects of cyberbullying is the design of a cyberbullying lexical database that specifies what linguistic and cyberbullying-specific information is relevant to the detection process.

Using Fuzzy Sets for Detecting Cyber Terrorism and Extremism in the Text

The experimental analysis shows that the fuzzy-set-based weighting method with an SVM classifier gives the best classification accuracy, reaching up to 99%.

Towards automated e-counselling system based on counsellors emotion perception

It is speculated, based on the findings, that the emotional state of counsellors influences their emotion perception while tracking emotions in text, and the advantages of using an automated e-counselling system for emotion analysis are discussed.

Survey of Text Categorization Techniques

Different text categorization systems are considered, using different classification algorithms for the classification of text documents.



Modeling the Detection of Textual Cyberbullying

This work decomposes the overall detection problem into the detection of sensitive topics, which lends itself to text classification sub-problems, and shows that the detection of textual cyberbullying can be tackled by building individual topic-sensitive classifiers.

Thumbs up? Sentiment Classification using Machine Learning Techniques

This work considers the problem of classifying documents not by topic, but by overall sentiment, e.g., determining whether a review is positive or negative, and concludes by examining factors that make the sentiment classification problem more challenging.

Neurobiology of Antisociality

This chapter will explore the current literature on the neurobiology of antisociality. The term “antisociality” will be used in distinction to “criminality”, which can refer to behavior defined

Detection of Harassment on Web 2.0

This paper uses a supervised learning approach for harassment detection that employs content features, sentiment features, and contextual features of documents and achieves significant improvements over several baselines, including Term Frequency-Inverse Document Frequency (TFIDF) approaches.

Modelling Fixated Discourse in Chats with Cyberpedophiles

A considerable variation in the length of sex-related lexical chains according to the nature of the corpus supports the belief that this could be a valuable feature in an automated pedophile detection system.

Affect Intensity Analysis of Dark Web Forums

An affect lexicon is constructed using a probabilistic disambiguation technique to measure the usage of violence and hate affects and reveals that the Middle Eastern test bed forums have considerably greater violence intensity than the U.S. groups.

A Sentimental Education: Sentiment Analysis Using Subjectivity Summarization Based on Minimum Cuts

A novel machine-learning method is proposed that applies text-categorization techniques to just the subjective portions of the document, which greatly facilitates incorporation of cross-sentence contextual constraints.

University Language: A corpus-based study of spoken and written registers

This book describes university registers from several different perspectives, including: vocabulary patterns; the use of lexico-grammatical and syntactic features; the expression of stance; the use of extended collocations ('lexical bundles'); and a Multi-Dimensional analysis of the overall patterns of register variation.

National Center for the Analysis of Violent Crime

Computerized records are maintained for one year and hard copy computer listings are maintained for six months. Cards containing badge information are destroyed when administrative needs have

Evidence for universality and cultural variation of differential emotion response patterning.

The empirical evidence is interpreted as supporting theories that postulate both a high degree of universality of differential emotion patterning and important cultural differences in emotion elicitation, regulation, symbolic representation, and social sharing.