Authorship Analysis of Online Predatory Conversations using Character Level Convolution Neural Networks

Kanishka Misra, Hemanth Devarapalli, Tatiana R. Ringenberg, Julia Taylor Rayz
2019 IEEE International Conference on Systems, Man and Cybernetics (SMC)
Authorship Attribution (AA) of written content presents several advantages within the digital forensics domain. While AA has traditionally been applied to long documents, recent works have shown improved performance of neural AA models on short texts such as tweets and online conversations. Concurrently, the rise of social media and a plethora of chat messaging platforms has left teenagers more vulnerable to online predators. In this work, we present an authorship… 

Implications of Using Internet Sting Corpora to Approximate Underage Victims
The corpus is annotated for the stages and tactics of the victimization process described in psychology research, revealing significant differences among the three classes of chats that are usually not taken into account in chat classification.
A Human-Centered Systematic Literature Review of the Computational Approaches for Online Sexual Risk Detection
A comprehensive literature review analyzing 73 peer-reviewed articles on computational approaches that use text or meta-data/multimedia for online sexual risk detection found that most work has focused on identifying sexual predators after the fact, rather than taking more nuanced approaches to identify potential victims and problematic patterns that could be used to prevent victimization before it occurs.


Authorship Attribution of Micro-Messages
The concept of an author’s unique “signature” is introduced, and it is shown that such signatures are typical of many authors when writing very short texts.
Authorship Attribution of Short Messages Using Multimodal Features
A multimodal classifier for authorship attribution of short messages is developed to show that the combination of natural-language and network-feature classifiers identifies a user-to-phone binding better than the individual classifiers do when used independently.
Authorship Attribution in Greek Tweets Using Author's Multilevel N-Gram Profiles
The first Modern Greek Twitter corpus, consisting of 12,973 tweets crawled from 10 popular Greek users, was used to study the effectiveness of a specific document representation called Author’s Multilevel N-gram Profile (AMNP) and the impact of different methods of training data construction on the task of authorship attribution.
A survey of modern authorship attribution methods
A survey of recent advances of the automated approaches to attributing authorship is presented, examining their characteristics for both text representation and text classification.
Convolutional Neural Networks for Authorship Attribution of Short Texts
A model to perform authorship attribution of tweets using Convolutional Neural Networks over character n-grams and a strategy that improves model interpretability by estimating the importance of input text fragments in the predicted classification are presented.
Exploiting Stylistic Idiosyncrasies for Authorship Attribution
Early researchers in authorship attribution used a variety of statistical methods to identify stylistic discriminators – characteristics which remain approximately invariant within the
Overview of the Author Identification Task at PAN-2018: Cross-domain Authorship Attribution and Style Change Detection
This edition of PAN studies two tasks: the novel task of cross-domain authorship attribution, where the texts of known and unknown authorship belong to different domains, and style change detection, where single-author and multi-author texts are to be distinguished.
Detecting predatory conversations in social media by deep Convolutional Neural Networks
Continuous N-gram Representations for Authorship Attribution
This paper presents work on using continuous representations for authorship attribution via a neural network jointly with the classification layer, and demonstrates that the proposed model outperforms the state-of-the-art on two datasets.
Authorship attribution in the wild
This paper shows the precise relationship between attribution precision and four parameters: the size of the candidate set, the quantity of known text by the candidates, the length of the anonymous text, and a certain robustness score associated with an attribution.