CitePlag : A Citation-based Plagiarism Detection System Prototype
@inproceedings{Meuschke2012CitePlagA, title={CitePlag : A Citation-based Plagiarism Detection System Prototype}, author={Norman Meuschke and Bela Gipp and Corinna Breitinger}, year={2012} }
This paper presents an open-source prototype of a citation-based plagiarism detection system called CitePlag. The algorithms consider multiple citation-related factors such as proximity and order of citations within the text, or their probability of co-occurrence in order to compute document similarity scores. We present technical details of CitePlag’s detection algorithms and the acquisition of test data from the PubMed Central Open Access Subset. Future advancement of the prototype lies in…
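To make the idea of scoring documents by citation order concrete, the following is a minimal sketch, not CitePlag's actual implementation. It assumes each document has already been reduced to the ordered list of reference identifiers it cites, and it scores two documents by the length of their longest common subsequence of citations. The function names, the normalization by the shorter sequence, and the sample data are all hypothetical choices made for illustration.

```python
from typing import Sequence


def longest_common_citation_sequence(a: Sequence[str], b: Sequence[str]) -> int:
    """Return the length of the longest common subsequence of two citation lists."""
    # prev[j] is the LCS length for the already-processed prefix of a and b[:j].
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                # Matching citation: extend the best sequence ending before both.
                curr.append(prev[j - 1] + 1)
            else:
                # No match: carry over the better of the two neighbouring results.
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def order_similarity(a: Sequence[str], b: Sequence[str]) -> float:
    """Hypothetical score: shared ordered citations relative to the shorter list."""
    if not a or not b:
        return 0.0
    return longest_common_citation_sequence(a, b) / min(len(a), len(b))


# Illustrative citation sequences (made-up reference identifiers).
doc_a = ["ref1", "ref2", "ref3", "ref4"]
doc_b = ["ref2", "ref9", "ref3", "ref4"]
score = order_similarity(doc_a, doc_b)
```

Because only the order of shared citations matters here, the score is unaffected by paraphrasing or translation of the surrounding text. Proximity and co-occurrence probability, which the abstract also mentions, are not modeled in this sketch.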
22 Citations
Comparing and combining Content‐ and Citation‐based approaches for plagiarism detection
- Computer Science
- J. Assoc. Inf. Sci. Technol.
- 2016
This work compares content‐ and citation‐based approaches for plagiarism detection to evaluate whether they are complementary and whether their combination can improve detection quality, and concludes that combining the methods can be beneficial.
Hamtajoo: A Persian Plagiarism Checker for Academic Manuscripts
- Computer Science
- ArXiv
- 2021
Hamtajoo, a Persian plagiarism detection system for academic manuscripts, is introduced, and the overall structure of the system along with the algorithms used in each stage are described.
Integrating syntax‐semantic‐based text analysis with structural and citation information for scientific plagiarism detection
- Computer Science
- J. Assoc. Inf. Sci. Technol.
- 2018
The proposed plagiarism detection system employs the effective coupling of various modules, namely, logical structure classifications and citation parsing, two‐stage candidate document selections, syntax‐semantic‐based exhaustive passage level analysis with plagiarism analysis using structural and citation information.
NeoPlag: An Ecosystem to Support the Development and Evaluation of New Algorithms to Detect Plagiarism
- Computer Science
- 2015 Asia-Pacific Conference on Computer Aided System Engineering
- 2015
A novel ecosystem is presented that supports developing new plagiarism detection algorithms, testing existing algorithms, and performing benchmarking analysis; a basic detection algorithm based on the vector space model was developed and uploaded into the system.
State-of-the-art in detecting academic plagiarism
- Computer Science
- 2013
In the future, plagiarism detection systems may benefit from combining traditional character-based detection methods with these emerging detection approaches, including intrinsic, cross-lingual and citation-based plagiarism detection.
Text Mining for Plagiarism Detection: Multivariate Pattern Detection for Recognition of Text Similarities
- Computer Science
- 2018 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM)
- 2018
A text mining methodology is proposed that can detect all common patterns between a document and the documents in a reference database. Applied to a well-defined dataset, it gave very promising results in identifying difficult cases of plagiarism such as technical disguise.
Citation-based Plagiarism Detection
- Physics
- Springer Fachmedien Wiesbaden
- 2014
When the author first considered the use of citation information as a method to detect plagiarism, he assumed this concept had already been explored or even integrated into today’s plagiarism…
Survey of Plagiarism Detection Approaches and Big data Techniques related to Plagiarism Candidate Retrieval
- Computer Science
- BDCA'17
- 2017
An overview of the best-known plagiarism detection methods is given, and the concept of big data is defined as one of the techniques that can be applied in the phase of extracting source documents for plagiarism detection.
Visualizing Feature-based Similarity for Research Paper Recommendation
- Computer Science
- 2021 ACM/IEEE Joint Conference on Digital Libraries (JCDL)
- 2021
Results from a study with 10 expert users show that the interactive visualization interface proposed can effectively address specialized information retrieval tasks, which cannot be addressed by existing research paper search or recommendation interfaces.
An academic Arabic corpus for plagiarism detection: design, construction and experimentation
- Computer Science
- International Journal of Educational Technology in Higher Education
- 2020
The design and construction of an Arabic PD reference corpus that is dedicated to academic language and a database for the detection of plagiarism in student assignments, reports, and dissertations is discussed.
References
SHOWING 1-10 OF 46 REFERENCES
Citation pattern matching algorithms for citation-based plagiarism detection: greedy citation tiling, citation chunking and longest common citation sequence
- Computer Science
- DocEng '11
- 2011
Three algorithms are introduced (Greedy Citation Tiling, Citation Chunking and Longest Common Citation Sequence), and it is shown that if these algorithms are combined, common forms of plagiarism can be detected reliably.
Citation based plagiarism detection: a new approach to identify plagiarized work language independently
- Computer Science
- HT '10
- 2010
This approach is based on citation analysis and allows duplicate and plagiarism detection even if a document has been paraphrased or translated, since the relative position of citations remains similar.
Citation Proximity Analysis (CPA) : A New Approach for Identifying Related Work Based on Co-Citation Analysis
- Computer Science
- 2009
The approach called Citation Proximity Analysis (CPA) is a further development of co-citation analysis, but in addition, considers the proximity of citations to each other within an article’s full-text.
Systematic Characterizations of Text Similarity in Full Text Biomedical Publications
- Computer Science
- PloS one
- 2010
While quantifying abstract similarity is an effective approach for finding duplicate citations, a comprehensive full text analysis is necessary to uncover all potential duplicate citations in the scientific literature and is helpful when establishing ethical guidelines for scientific publications.
SPLAT: A System for Self-Plagiarism Detection
- Computer Science
- ICWI
- 2003
This paper presents a system for self-plagiarism detection, SPLAT. The system uses a WebL web spider that crawls through the web sites of the top fifty Computer Science departments, downloading…
Sentence boundary detection: a comparison of paradigms for improving MT quality
- Computer Science
- MTSUMMIT
- 2001
A comparison of different paradigms for the detection of sentence boundaries in written text is presented: Directly encoding the knowledge in a program, a rule-based system relying on regular expressions to describe boundaries, and a statistical maximum-entropy learning algorithm to obtain knowledge about boundaries.
Plagiarism analysis, authorship identification, and near-duplicate detection PAN'07
- Business
- SIGF
- 2007
The goal of the workshop was to bring together experts and prospective researchers around the exciting and future-oriented topic of plagiarism analysis, authorship identification, and high similarity…
Test cases for plagiarism detection software
- Art
- 2010
A typology of plagiarism, which makes clear that plagiarism is more than just an exact copy, is discussed, and a collection of 42 test cases in German are presented that were developed at the HTW Berlin for testing plagiarism detection software.
dTagger: A POS Tagger
- Computer Science
- AMIA
- 2006
The Lexical Systems Group at the National Library of Medicine (NLM) has developed a Part-of-Speech (POS) tagger to be freely distributed with the SPECIALIST NLP Tools. dTagger is specifically…
Automatically Adapting an NLP Core Engine to the Biology Domain
- Computer Science
- 2006
In the first evaluation ever of a ML-based ensemble of core NLP components in the biology domain, it is demonstrated that the performance of OpenNLP’s sentence splitter, tokenizer, part-of-speech tagger, chunker and parser matches up with state-of-the-art performance figures from the newspaper domain.