Detection of Plagiarism in Arabic Documents

Abstract

Many language-sensitive tools for detecting plagiarism in natural language documents have been developed, particularly for English. Languageindependent tools exist as well, but are considered restrictive as they usually do not take into account specific language features. Detecting plagiarism in Arabic documents is particularly a challenging task because of the complex linguistic structure of Arabic. In this paper, we present a plagiarism detection tool for comparison of Arabic documents to identify potential similarities. The tool is based on a new comparison algorithm that uses heuristics to compare suspect documents at different hierarchical levels to avoid unnecessary comparisons. We evaluate its performance in terms of precision and recall on a large data set of Arabic documents, and show its capability in identifying direct and sophisticated copying, such as sentence reordering and synonym substitution. We also demonstrate its advantages over other plagiarism detection tools, including Turnitin, the well-known language-independent tool.

6 Figures and Tables

Cite this paper

@inproceedings{Menai2012DetectionOP, title={Detection of Plagiarism in Arabic Documents}, author={Mohamed El Bachir Menai}, year={2012} }