Intrinsic Plagiarism Detection Using Character Trigram Distance Scores - Notebook for PAN at CLEF 2011

@inproceedings{Kestemont2011IntrinsicPD,
  title={Intrinsic Plagiarism Detection Using Character Trigram Distance Scores - Notebook for PAN at CLEF 2011},
  author={Mike Kestemont and Kim Luyckx and Walter Daelemans},
  booktitle={CLEF},
  year={2011}
}
In this paper, we describe a novel approach to intrinsic plagiarism detection. Each suspicious document is divided into a series of consecutive, potentially overlapping ‘windows’ of equal size. These are represented by vectors containing the relative frequencies of a predetermined set of high-frequency character trigrams. Subsequently, a distance matrix is set up in which each of the document’s windows is compared to each other window. The distance measure used is a symmetric adaptation of the…
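
The windowing and distance-matrix steps described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the window size and step, and the choice to take the vocabulary from the document's own most frequent trigrams are all assumptions. The abstract is cut off before it names the distance measure, so `placeholder_distance` (a mean absolute difference) is only a stand-in that happens to be symmetric.

```python
from collections import Counter


def char_trigrams(text):
    """Return all overlapping character trigrams in text."""
    return [text[i:i + 3] for i in range(len(text) - 2)]


def windows(text, size, step):
    """Split text into equal-size windows; a step smaller than size gives overlap."""
    return [text[i:i + size] for i in range(0, len(text) - size + 1, step)]


def profile(window, vocabulary):
    """Relative frequency of each vocabulary trigram within one window."""
    grams = char_trigrams(window)
    counts = Counter(grams)
    total = len(grams) or 1  # guard against windows shorter than three characters
    return [counts[g] / total for g in vocabulary]


def placeholder_distance(u, v):
    """Stand-in symmetric distance (assumption); NOT the measure used in the paper."""
    return sum(abs(a - b) for a, b in zip(u, v)) / len(u)


def distance_matrix(text, size=20, step=10, top_n=5):
    """Compare every window of the document with every other window."""
    # Assumed vocabulary: the top_n most frequent trigrams of the document itself.
    vocabulary = [g for g, _ in Counter(char_trigrams(text)).most_common(top_n)]
    vectors = [profile(w, vocabulary) for w in windows(text, size, step)]
    return [[placeholder_distance(u, v) for v in vectors] for u in vectors]


# Toy input; the sizes here are far smaller than would be useful on real documents.
matrix = distance_matrix("the cat sat on the mat and the cat sat on the hat " * 2)
```

Each row of `matrix` holds the distances from one window to all others, so a window whose row is consistently high would be a candidate for stylistic deviation under this sketch.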

