Corpus ID: 633887

Author Obfuscation: Attacking the State of the Art in Authorship Verification

@inproceedings{Potthast2016AuthorOA,
  title={Author Obfuscation: Attacking the State of the Art in Authorship Verification},
  author={Martin Potthast and Matthias Hagen and Benno Stein},
  booktitle={CLEF},
  year={2016}
}
We report on the first large-scale evaluation of author obfuscation approaches built to attack authorship verification approaches: the impact of 3 obfuscators on the performance of a total of 44 authorship verification approaches has been measured and analyzed. The best-performing obfuscator successfully impacts the decision-making process of the authorship verifiers on average in about 47% of the cases, causing them to misjudge a given pair of documents as having been written by “different…

Overview of the Author Obfuscation Task at PAN 2017: Safety Evaluation Revisited
TLDR
There is still a way to go to “perfect” automatic obfuscation that (1) tricks verification approaches, (2) keeps the meaning of the original, and (3) is, regarding its obfuscation, unsuspicious to a human eye.
Heuristic Authorship Obfuscation
TLDR
A new obfuscation approach models writing style difference as the Jensen-Shannon distance between the character n-gram distributions of texts, and manipulates an author’s subconsciously encoded writing style in a sophisticated manner using heuristic search.
On divergence-based author obfuscation: An attack on the state of the art in statistical authorship verification
TLDR
An approach that models writing style difference as the Jensen-Shannon distance between the character n-gram distributions of texts, and manipulates an author’s writing style in a sophisticated manner using heuristic search is introduced.
Authorship Obfuscation Using Heuristic Search
In this thesis, we discuss the Jensen-Shannon divergence as a model for authorship and based on it, we present an approach to obfuscating a text’s authorship in order to impede automated authorship…
Overview of the Author Obfuscation Task at PAN 2018: A New Approach to Measuring Safety
TLDR
A set of new performance measures is introduced, designed to render the performance of obfuscation approaches comparable as the numbers of author identification approaches and evaluation datasets increase, incorporating their respective performance and quality.
Authorship Verification: A Review of Recent Advances
TLDR
This paper presents a review of recent advances in authorship verification focusing on the evaluation results of PAN shared tasks and discusses successes, failures, and open issues.
Recognizing and Imitating Programmer Style: Adversaries in Program Authorship Attribution
TLDR
It is found that it is possible for a non-expert adversary to defeat a source code attribution system designed to be adversarially resistant.
Assessing the Applicability of Authorship Verification Methods
TLDR
This paper proposes clear criteria and properties that aim to improve the characterization of existing and future AV approaches, including the current state of the art, and identifies that all involved methods are prone to cross-topic verification cases.
Author Obfuscation on Indonesian News Articles Using Genetic Algorithms
TLDR
A genetic algorithm-based author obfuscation model was created to modify Indonesian news articles so that they avoid identification by authorship attribution while keeping their semantics.
SU@PAN'2016: Author Obfuscation
TLDR
This paper presents the approach for hiding an author’s identity by masking their style, which was developed for the Author Obfuscation task, part of the PAN-2016 competition.
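Several of the entries above describe modeling writing style difference as the Jensen-Shannon distance between the character n-gram distributions of texts. The following is a minimal sketch of that measure only, not the authors' implementation: the function names, the default n = 3, and the base-2 logarithm are assumptions made here for illustration, and the heuristic search that those papers build on top of the measure is not shown.

```python
import math
from collections import Counter


def char_ngram_distribution(text, n=3):
    """Return relative frequencies of overlapping character n-grams (hypothetical helper)."""
    counts = Counter(text[i:i + n] for i in range(len(text) - n + 1))
    total = sum(counts.values())
    if total == 0:
        raise ValueError("text must contain at least n characters")
    return {gram: count / total for gram, count in counts.items()}


def jensen_shannon_distance(p, q):
    """Square root of the base-2 Jensen-Shannon divergence of two distributions.

    p and q map n-grams to probabilities; missing n-grams count as probability 0.
    With a base-2 logarithm the result lies in [0, 1].
    """
    divergence = 0.0
    for gram in set(p) | set(q):
        p_i = p.get(gram, 0.0)
        q_i = q.get(gram, 0.0)
        m_i = 0.5 * (p_i + q_i)  # mixture distribution
        if p_i > 0:
            divergence += 0.5 * p_i * math.log2(p_i / m_i)
        if q_i > 0:
            divergence += 0.5 * q_i * math.log2(q_i / m_i)
    # Guard against tiny negative values caused by floating-point rounding.
    return math.sqrt(max(divergence, 0.0))


def style_distance(text_a, text_b, n=3):
    """Jensen-Shannon distance between the character n-gram profiles of two texts."""
    return jensen_shannon_distance(
        char_ngram_distribution(text_a, n),
        char_ngram_distribution(text_b, n),
    )
```

Under these assumptions, `style_distance` returns 0 for two identical texts and 1 for two texts that share no character n-grams. An obfuscator in the spirit of the entries above would try to change a text so that this distance to the author's other writing increases.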

References

SHOWING 1-10 OF 125 REFERENCES
Poster: Assessing the Effectiveness of Countermeasures Against Authorship Recognition
Methods for authorship recognition were originally developed to aid in criminal investigations and attribution of historical texts. Nowadays, however, in an age in which the Internet has become the…
Empirical evaluation of authorship obfuscation using JGAAP
TLDR
This work uses a newly published corpus (the Brennan-Greenstadt Obfuscation corpus) and the JGAAP system to test different methods of authorship attribution against essays written in a deliberate attempt to mask style.
Secure Obfuscation of Authoring Style
TLDR
This paper provides a secure obfuscation scheme that is able to hide an author's document securely among other authors' documents in a corpus and presents a new algorithm for identifying an author's unique words that would be of independent interest.
Analyzing Stylometric Approaches to Author Obfuscation
TLDR
This paper analyzes the methods implemented in the Java Graphical Authorship Attribution Program (JGAAP) against essays in the Brennan-Greenstadt obfuscation corpus that were written in deliberate attempts to mask style.
SU@PAN'2016: Author Obfuscation
TLDR
This paper presents the approach for hiding an author’s identity by masking their style, which was developed for the Author Obfuscation task, part of the PAN-2016 competition.
Practical Attacks Against Authorship Recognition Techniques
TLDR
This paper presents a framework for adversarial attacks, including obfuscation attacks, where a subject attempts to hide their identity, and imitation attacks, where a subject tries to frame another subject by imitating their writing style.
Obfuscating Document Stylometry to Preserve Author Anonymity
TLDR
This paper explores techniques for reducing the effectiveness of standard authorship attribution techniques so that an author A can preserve anonymity for a particular document D and introduces two levels of anonymization: shallow and deep.
Adversarial stylometry: Circumventing authorship recognition to preserve privacy and anonymity
TLDR
This research demonstrates that manual circumvention methods work very well while automated translation methods are not effective, and argues that this field is important to a multidisciplinary approach to privacy, security, and anonymity.
Detecting Hoaxes, Frauds, and Deception in Writing Style Online
TLDR
It is shown that using a large feature set, it is possible to distinguish regular documents from deceptive documents with 96.6% accuracy (F-measure) and an analysis of linguistic features that can be modified to hide writing style is presented.
Local n-grams for Author Identification: Notebook for PAN at CLEF 2013
TLDR
This approach, which came in third in the PAN 2013 competition, builds on existing authorship attribution methods using local n-grams (LNG) and combines them in a weighted ensemble, using a relatively simple scheme of weights based on training set accuracy.