Performance Comparison of Rabin-Karp Algorithm and Winnowing Algorithm for Document Abstraction Similarity Detection
@article{DwiHartanto2022PerformanceCO, title={Performance Comparison of Rabin-Karp Algorithm and Winnowing Algorithm for Document Abstraction Similarity Detection}, author={Anggit Dwi Hartanto and Yoga Pristyanto and Andy Saputra and Eli Pujastuti and Atik Nurmasani and Ika Asti Astuti}, journal={2022 International Conference on Informatics, Multimedia, Cyber and Information System (ICIMCIS)}, year={2022}, pages={281-286}, url={https://api.semanticscholar.org/CorpusID:256034568} }
The authors conclude that Winnowing is recommended when sensitivity to similarity matters most, whereas Rabin-Karp is recommended when processing time is the priority.
One Citation
Measuring of Scientific Document Abstraction Similarity Using Rabin-Karp and Poter Stemmer
- 2023
Computer Science, Education
The paper proposes the Porter stemmer to handle bilingual text and reports that the Rabin-Karp + Porter Stemmer model outperforms the Rabin-Karp + Sastrawi Stemmer model, with a difference in similarity values of around 2-3 percent.
28 References
Best Parameter Selection Of Rabin-Karp Algorithm In Detecting Document Similarity
- 2019
Computer Science, Education
The results showed that the selection of gram values and prime bases affected the processing time in testing the data and the similarity values of the documents being tested.
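As a rough illustration of why the gram size and the prime base matter, the sketch below computes Rabin-Karp style rolling hashes over every k-gram of a text. The function name `kgram_hashes` and the defaults (`k=5`, `base=101`, `mod=2**31 - 1`) are assumptions chosen for this example, not parameters taken from the paper.

```python
def kgram_hashes(text, k=5, base=101, mod=2**31 - 1):
    """Return the polynomial rolling hash of every k-gram in text.

    k, base and mod are illustrative defaults (assumptions), not values
    reported by the paper.
    """
    if k <= 0 or len(text) < k:
        return []
    high = pow(base, k - 1, mod)  # weight of the character leaving the window
    h = 0
    for ch in text[:k]:
        h = (h * base + ord(ch)) % mod
    hashes = [h]
    for i in range(k, len(text)):
        h = (h - ord(text[i - k]) * high) % mod  # drop the leftmost character
        h = (h * base + ord(text[i])) % mod      # append the new character
        hashes.append(h)
    return hashes
```

A larger `k` yields fewer, more specific grams, while `base` and `mod` influence how often unrelated grams collide, which is one plausible reason both parameters would affect similarity values and run time.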
Comparison of Carp Rabin Algorithm and Jaro-Winkler Distance to Determine The Equality of Sunda Languages
- 2019
Computer Science
Several similarity detection programs and methods exist, including Turnitin, Eve2, CopyCatchGold, WordCheck, Glatt, the Jaro-Winkler distance, and the Rabin-Karp algorithm, which is suitable for long pattern searches.
Rabin Karp And Winnowing Algorithm For Statistics Of Text Document Plagiarism Detection
- 2019
Computer Science
This plagiarism detection system helps identify plagiarism from the similarity of sequences in the two documents compared, a basic process that can be developed further to build better plagiarism detection applications.
Combination of levenshtein distance and rabin-karp to improve the accuracy of document equivalence level
- 2018
Computer Science
The Levenshtein algorithm can be used to replace the hash calculation on the Rabin-Karp algorithm, which is perfect for multiple pattern search.
Document Similarity Detection using Rabin-Karp and Cosine Similarity Algorithms
- 2021
Computer Science, Education
The experiment results show that the Rabin-Karp algorithm with Cosine Similarity can be used to detect the similarity of published manuscripts, especially in Indonesian, with the fastest processing time of seconds and the longest around.
A Plagiarism Detection Algorithm based on Extended Winnowing
- 2017
Computer Science
A method extending the classic Winnowing plagiarism detection algorithm is introduced; it retains text location and length information from the original document while extracting the document's fingerprints, which makes locating and marking plagiarized text fragments much easier.
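The idea of keeping location and length alongside each fingerprint can be sketched with classic winnowing, where the minimum hash of each window is recorded together with its position. This is a minimal sketch under stated assumptions, not the extended algorithm from the paper: the helper names `kgram_hash` and `winnow`, the hash function, and the defaults `k=5` and `w=4` are all hypothetical.

```python
def kgram_hash(gram, base=101, mod=2**31 - 1):
    """Deterministic polynomial hash of one k-gram (illustrative choice)."""
    h = 0
    for ch in gram:
        h = (h * base + ord(ch)) % mod
    return h


def winnow(text, k=5, w=4):
    """Return fingerprints as (hash, position, length) tuples.

    Classic winnowing: slide a window over w consecutive k-gram hashes and
    record the rightmost minimum of each window once.
    """
    hashes = [kgram_hash(text[i:i + k]) for i in range(len(text) - k + 1)]
    fingerprints = []
    last = -1  # position of the most recently recorded fingerprint
    for start in range(max(len(hashes) - w + 1, 0)):
        window = hashes[start:start + w]
        smallest = min(window)
        # Index of the rightmost occurrence of the minimum in this window.
        pos = start + (w - 1 - window[::-1].index(smallest))
        if pos != last:
            fingerprints.append((hashes[pos], pos, k))
            last = pos
    return fingerprints
```

Because each fingerprint carries its position and gram length, a matching hash can be mapped back to `text[pos:pos + length]` to highlight the suspected fragment.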
Improved Rabin-Karp Algorithm Using Bloom Filter
- 2022
Computer Science
A modified version of the Rabin-Karp algorithm is presented that uses a Bloom filter as a preprocessing phase; the filter detects pattern absence early, which increases speed and reduces the number of patterns passed to the exact Rabin-Karp matching phase.
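One way such a preprocessing phase could work is sketched below: every k-gram of the text is added to a Bloom filter, and any pattern whose first k characters are definitely absent is discarded before exact matching. The class `BloomFilter`, the function `candidate_patterns`, and the sizing parameters are assumptions for illustration; the paper's actual design may differ.

```python
import hashlib


class BloomFilter:
    """Small Bloom filter; size and num_hashes are illustrative defaults."""

    def __init__(self, size=1024, num_hashes=3):
        self.size = size
        self.num_hashes = num_hashes
        self.bits = bytearray(size)

    def _indexes(self, item):
        # Derive num_hashes bit positions from seeded SHA-256 digests.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item):
        for i in self._indexes(item):
            self.bits[i] = 1

    def might_contain(self, item):
        # False means definitely absent; True means possibly present.
        return all(self.bits[i] for i in self._indexes(item))


def candidate_patterns(text, patterns, k):
    """Keep only patterns whose first k characters may occur in text."""
    bloom = BloomFilter()
    for i in range(len(text) - k + 1):
        bloom.add(text[i:i + k])
    return [p for p in patterns if len(p) >= k and bloom.might_contain(p[:k])]
```

A Bloom filter never produces false negatives, so a pattern that truly occurs in the text always survives; false positives are possible, which is why an exact matching phase is still needed afterwards.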
Similarity detection design using Winnowing Algorithm as an effort to apply green computing
- 2020
Computer Science, Environmental Science
A Winnowing Algorithm is applied to detect the similarity of proposal documents through a website, as an effort to apply green computing on campus.
String Matching based Plagiarism Detection for Document in Bahasa Indonesia
- 2019
Computer Science
String matching is an approach to plagiarism detection in computer science that uses a "character by character" matching method and generates a similarity percentage for the document by applying Dice's Similarity Coefficient to the N-Gram result.
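The combination of N-grams and Dice's coefficient described above can be sketched as follows. This minimal version assumes character-level grams compared as sets, with `n=3` as an arbitrary default; the function names are hypothetical, and the paper may tokenize or count grams differently.

```python
def char_ngrams(text, n=3):
    """Return the set of character n-grams in text."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def dice_similarity(a, b, n=3):
    """Dice coefficient 2|X & Y| / (|X| + |Y|) over the n-gram sets of a and b."""
    x, y = char_ngrams(a, n), char_ngrams(b, n)
    if not x and not y:
        return 0.0  # both texts are shorter than n, so nothing can be compared
    return 2 * len(x & y) / (len(x) + len(y))


# "night" and "nacht" share one bigram ("ht") out of 4 + 4 in total.
print(dice_similarity("night", "nacht", n=2))  # 0.25
```

Multiplying the result by 100 gives the similarity percentage mentioned in the summary.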
PlagAL: Plagiarism detection system for Albanian texts
- 2021
Computer Science, Education
A system for identifying cross-language plagiarism and text plagiarism in Albanian and English languages is proposed and is expected to increase the quality and responsibility in universities and educational institutions by monitoring student work through this system.