• Corpus ID: 215822346

Overview of the 6th International Competition on Plagiarism Detection

@inproceedings{Potthast2014OverviewOT,
  title={Overview of the 6th International Competition on Plagiarism Detection},
  author={Martin Potthast and Matthias Hagen and Tim Gollub and Martin Tippmann and Johannes Kiesel and Paolo Rosso and Efstathios Stamatatos and Benno Stein},
  booktitle={CLEF},
  year={2014}
}
This paper overviews 18 plagiarism detectors that have been developed and evaluated within PAN'10. We start with a unified retrieval process that summarizes the best practices employed this year. Then, the detectors' performances are evaluated in detail, highlighting several important aspects of plagiarism detection, such as obfuscation, intrinsic vs. external plagiarism, and plagiarism case length. Finally, all results are compared to those of last year's competition.
HawkEyes Plagiarism Detection System
TLDR
HawkEyes, a plagiarism detection system based on the source retrieval and text alignment algorithms that were developed for the international competition on plagiarism detection organized by CLEF, is proposed.
Plagiarism Detection: An Overview of Text Alignment Techniques
TLDR
This thesis is mainly concerned with the detailed analysis phase, more specifically with the problem of text alignment and the other subtasks that follow from it.
Diverse Queries and Feature Type Selection for Plagiarism Discovery Notebook for PAN at CLEF 2013
This paper describes approaches used for the Plagiarism Detection task in PAN 2013 international competition on uncovering plagiarism, authorship, and social software misuse. We present modified
A Pairwise Document Analysis Approach for Monolingual Plagiarism Detection
TLDR
To retrieve plagiarised passages, a plagiarism detection method based on the vector space model, insensitive to context reordering, is presented and evaluated in terms of the precision, recall, granularity, and plagdet metrics.
Evaluating AdaBoost for Plagiarism Detection
TLDR
This work proposes and evaluates the adoption of AdaBoost for classifying suspicious text passages as plagiarism or not, and presents a simple post-processing heuristic for improving results granularity.
Plagiarism Detection - State-of-the-art systems (2016) and evaluation methods
TLDR
The current research situation in the field of plagiarism detection is assessed, a taxonomy provided in former research is used to classify recent approaches, and further research questions and approaches to be tackled in the future are derived.
Overview of the AraPlagDet PAN@FIRE2015 Shared Task on Arabic Plagiarism Detection
TLDR
This overview paper describes the evaluation corpora for plagiarism detection methods for Arabic texts, discusses the participants' methods, and highlights their building blocks that could be language dependent.
Improved Evaluation Framework for Complex Plagiarism Detection
TLDR
This paper studies the performance of plagdet, the main measure for plagiarism detection, on manually paraphrased datasets, reveals its fallibility under certain conditions, and proposes an evaluation framework with normalization of inner terms, which is resilient to dataset imbalance.
Approaches for Source Retrieval and Text Alignment of Plagiarism Detection Notebook for PAN at CLEF 2013
TLDR
This paper describes the approach at the PAN@CLEF2013 plagiarism detection competition, and proposes a method based on sentence similarity to extract the keywords of suspicious documents as queries to retrieve the plagiarism source document.
Improving Plagiarism Detection
TLDR
This thesis compares the performance of commonly available plagiarism detectors to BRUTE, the solution used in the student task submission system at CTU FEE, and describes a solution based on enhanced suffix arrays which mitigates them.
...

References

SHOWING 1-10 OF 174 REFERENCES
A Plagiarism Detector for Intrinsic Plagiarism - Lab Report for PAN at CLEF 2010
TLDR
The algorithm is based on the Lempel-Ziv distance, which is applied to extract structural information from texts, and tries to find outliers in the vector of distances between each fragment of the text and the whole document itself.
Improving the Reliability of the Plagiarism Detection System - Lab Report for PAN at CLEF 2010
TLDR
This paper describes the approach at the PAN 2010 plagiarism detection competition, and discusses the computational cost of each step of the implementation, including the performance data from two different computers.
An Evaluation Framework for Plagiarism Detection
TLDR
Empirical evidence is given that the construction of tailored training corpora for plagiarism detection can be automated, and hence be done on a large scale.
Encoplot - Performance in the Second International Plagiarism Detection Challenge - Lab Report for PAN at CLEF 2010
TLDR
This year's submission is generated by the same method, Encoplot, that was developed for last year's competition, with a single improvement.
A Set-Based Approach to Plagiarism Detection Notebook for PAN at CLEF 2012
TLDR
The approach to the Detailed Analysis subtask of the PAN 2012 competition is described, which uses a simple set-based algorithm that employs Dice's coefficient as a similarity measure, and employs basic strategies from Information Retrieval and Natural Language Processing for stop word removal and language detection.
Diverse Queries and Feature Type Selection for Plagiarism Discovery Notebook for PAN at CLEF 2013
This paper describes approaches used for the Plagiarism Detection task in PAN 2013 international competition on uncovering plagiarism, authorship, and social software misuse. We present modified
The Encoplot Similarity Measure for Automatic Detection of Plagiarism - Notebook for PAN at CLEF 2011
TLDR
The main novelties are the introduction of a new similarity measure and a new ranking method, which cooperate to rank the source–suspicious document pairs much better when selecting the candidates for the detailed analysis phase.
Evaluating Robustness for 'IPCRESS': Surrey's Text Alignment for Plagiarism Detection
TLDR
This paper briefly describes the approach taken to the subtask of Text Alignment in the Plagiarism Detection track at PAN 14, and presents results from this re-implementation with respect to various PAN collections.
Tackling the PAN’09 External Plagiarism Detection Corpus with a Desktop Plagiarism Detector
TLDR
Ferret was able to detect numerous files in the development corpus that contain substantial similarities not marked as plagiarism, but it also identified quite a lot of pairs where random similarities masked actual plagiarism.
UFRGS@PAN2010: Detecting External Plagiarism - Lab Report for PAN at CLEF 2010
TLDR
The method is designed to detect cross-language plagiarism and is composed of five phases: language normalization, retrieval of candidate documents, classifier training, plagiarism analysis, and post-processing.
...