Algorithms and Corpora for Persian Plagiarism Detection: Overview of PAN at FIRE 2016

@inproceedings{Asghari2016AlgorithmsAC,
  title={Algorithms and Corpora for Persian Plagiarism Detection: Overview of PAN at FIRE 2016},
  author={Habibollah Asghari and Salar Mohtaj and Omid Fatemi and Heshaam Faili and Paolo Rosso and Martin Potthast},
  booktitle={FIRE},
  year={2016}
}
The task of plagiarism detection is to find passages of text reuse in a suspicious document. [...] In the first subtask, nine teams participated, and the best result achieved was a PlagDet score of 0.92. For the second subtask, corpus construction, five teams each submitted a corpus, and these corpora were evaluated using the systems submitted for the first subtask. The results show that significant challenges remain in evaluating newly constructed corpora.
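The PlagDet score mentioned above combines precision, recall, and granularity into a single number. The following is a minimal sketch, assuming the standard definition used in the PAN evaluation framework (the F1 score divided by log2(1 + granularity)). The function name `plagdet` and the input values are illustrative assumptions and are not taken from this paper.

```python
import math


def plagdet(precision: float, recall: float, granularity: float) -> float:
    """Combine precision, recall and granularity into one score.

    Assumes the usual PAN definition: F1 / log2(1 + granularity).
    Granularity is >= 1; a value of 1 means each plagiarism case was
    reported as exactly one detection, so the score equals plain F1.
    """
    if granularity < 1:
        raise ValueError("granularity must be >= 1")
    if precision + recall == 0:
        return 0.0  # no correct detections at all
    f1 = 2 * precision * recall / (precision + recall)
    return f1 / math.log2(1 + granularity)


# Illustrative values only (not results reported in the paper).
score = plagdet(precision=0.9, recall=0.8, granularity=1.0)
```

Because the denominator grows with granularity, a detector that splits one plagiarism case into several fragments is penalized even when its precision and recall are high.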

A Deep Learning Approach to Persian Plagiarism Detection

In this paper, a deep learning based method to detect plagiarism is proposed: words are represented as multi-dimensional vectors, and simple aggregation methods are used to combine the word vectors into sentence representations.

Hamtajoo: A Persian Plagiarism Checker for Academic Manuscripts

Hamtajoo, a Persian plagiarism detection system for academic manuscripts, is introduced, and the overall structure of the system is described along with the algorithms used in each stage.

A crowdsourcing approach to construct mono-lingual plagiarism detection corpus

This paper proposes HAMTA, a Persian plagiarism detection corpus. A crowdsourcing platform is developed, and crowd workers are asked to paraphrase fragments of text in order to simulate real cases of plagiarism.

A crowdsourcing approach to construct mono-lingual plagiarism detection corpus

HAMTA, a Persian plagiarism detection corpus, is proposed, and evaluation results indicate a high correlation between the proposed corpus and the PAN state-of-the-art English plagiarism detection corpus.

Using Local Text Similarity in Pairwise Document Analysis for Monolingual Plagiarism Detection

To retrieve plagiarised passages, this paper presents a pairwise plagiarism detection algorithm based on a vector space model that considers the proximity of terms, and evaluates its performance in terms of precision, recall, granularity and PlagDet metrics.

ParsiPayesh: Persian Plagiarism Detection based on Semantic and Structural Analysis

The results indicate that structural and semantic information improves the performance of the proposed method; to examine the semantic similarity of expressions, the paper suggests using the semantic role labeling obtained from the presented deep learning model.

Persian Plagiarism Detection Using Sentence Correlations

This report explains the Persian plagiarism detection system that was used to submit a run to the Persian PlagDet competition at FIRE 2016; performance measures on the training corpus were promising.

Academic Plagiarism Detection: A Systematic Literature Review

The integration of heterogeneous analysis methods for textual and non-textual content features using machine learning is seen as the most promising area for future research contributions to improve the detection of academic plagiarism further.

A Pairwise Document Analysis Approach for Monolingual Plagiarism Detection

To retrieve plagiarised passages, a plagiarism detection method based on a vector space model, insensitive to context reordering, is presented and evaluated in terms of precision, recall, granularity and PlagDet metrics.

References

A Deep Learning Approach to Persian Plagiarism Detection

In this paper, a deep learning based method to detect plagiarism is proposed: words are represented as multi-dimensional vectors, and simple aggregation methods are used to combine the word vectors into sentence representations.

Persian Plagiarism Detection Using Sentence Correlations

This report explains the Persian plagiarism detection system that was used to submit a run to the Persian PlagDet competition at FIRE 2016; performance measures on the training corpus were promising.

The Short Stories Corpus: Notebook for PAN at CLEF 2015

This work describes the construction of a plagiarism detection/text reuse corpus submitted for the PAN-2015 Evaluation Lab and finds patterns of textual similarity between story retellings within the corpus.

A Pairwise Document Analysis Approach for Monolingual Plagiarism Detection

To retrieve plagiarised passages, a plagiarism detection method based on a vector space model, insensitive to context reordering, is presented and evaluated in terms of precision, recall, granularity and PlagDet metrics.

Approaches for Source Retrieval and Text Alignment of Plagiarism Detection: Notebook for PAN at CLEF 2013

This paper describes the approach at the PAN@CLEF2013 plagiarism detection competition, and proposes a method based on sentence similarity to extract the keywords of suspicious documents as queries to retrieve the plagiarism source document.

Graph-based Approach to Text Alignment for Plagiarism Detection in Persian Documents

This paper presents a new approach for Persian plagiarism detection. The approach uses a graph structure together with an iterative graph similarity method for similarity detection.

Overview of the AraPlagDet PAN@FIRE2015 Shared Task on Arabic Plagiarism Detection

This overview paper describes the evaluation corpora for plagiarism detection methods on Arabic texts, discusses the participants' methods, and highlights the building blocks that could be language dependent.

Developing Bilingual Plagiarism Detection Corpus Using Sentence Aligned Parallel Corpus: Notebook for PAN at CLEF 2015

A bilingual Persian-English sentence-aligned parallel corpus, in combination with Wikipedia articles, is used to create a plagiarism detection corpus based on parallel corpus sentences.

Evaluation of Text Reuse Corpora for Text Alignment Task of Plagiarism Detection

This paper addresses the text alignment task of the 7th International Competition on Plagiarism Detection, PAN 2015, and finds that most of the plagiarism cases in the prepared corpora have a rather high quality in terms of "rate of obfuscation" alongside "preserving the concepts".

Developing Monolingual Persian Corpus for Extrinsic Plagiarism Detection Using Artificial Obfuscation: Notebook for PAN at CLEF 2015

The approach for construction of a monolingual Persian plagiarism corpus that can be used to evaluate the performance of Persian plagiarism detection systems is described.