N-Gram Feature Selection for Authorship Identification
@inproceedings{Houvardas2006NGramFS, title={N-Gram Feature Selection for Authorship Identification}, author={John Houvardas and E. Stamatatos}, booktitle={AIMSA}, year={2006} }
Automatic authorship identification offers a valuable tool for supporting crime investigation and security. It can be seen as a multi-class, single-label text categorization task. Character n-grams are a very successful approach to representing text for stylistic purposes, since they can capture nuances at the lexical, syntactic, and structural levels. So far, character n-grams of fixed length have been used for authorship identification. In this paper, we propose a variable-length n-gram…
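To make the idea of a character n-gram representation concrete, here is a minimal Python sketch that counts overlapping character n-grams of one fixed length and then pools counts over several lengths. The function names (`char_ngrams`, `char_ngram_profile`), the sample sentence, and the choice of lengths 3 to 5 are illustrative assumptions; this is not the feature selection method of the paper, whose description is cut off in the abstract above.

```python
from collections import Counter


def char_ngrams(text, n):
    """Count all overlapping character n-grams of length n in text.

    Returns an empty Counter when the text is shorter than n.
    """
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def char_ngram_profile(text, lengths=(3, 4, 5)):
    """Pool n-gram counts over several lengths (hypothetical helper).

    The default lengths are an assumption made for illustration only.
    """
    profile = Counter()
    for n in lengths:
        profile.update(char_ngrams(text, n))
    return profile


# Illustrative sample text, not data from the paper.
sample = "the cat sat on the mat"
trigrams = char_ngrams(sample, 3)
profile = char_ngram_profile(sample)

# The most frequent n-grams could serve as candidate stylistic features.
print(trigrams.most_common(3))
```

Because n-grams are taken over raw characters, spaces and punctuation are part of the features, which is one way such a representation can pick up structural as well as lexical habits.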
186 Citations
A Machine Learning Framework for Authorship Identification From Texts
- Computer Science, Mathematics
- ArXiv
- 2019
- 5
- PDF
Authorship Identification of E-mail as a Multi-Class Task - Notebook for PAN at CLEF 2011
- Computer Science
- CLEF
- 2011
- 4
- PDF
An Investigation of Supervised Learning Methods for Authorship Attribution in Short Hinglish Texts using Char & Word N-grams
- Computer Science
- ArXiv
- 2018
- 4
- PDF
Fan-Fictional Texts given variable length Character and Word N-Grams Notebook for PAN at CLEF 2019
- 2019
- 1
- PDF
Empirical Evaluations Using Character and Word N-Grams on Authorship Attribution for Telugu Text
- Psychology
- 2015
- 5
Authorship identification from unstructured texts
- Computer Science
- Knowl. Based Syst.
- 2014
- 43
- Highly Influenced
- PDF