1. Comparison of the BM25 and rabinkarp algorithm for plagiarism detection
- Author
- I N S W Wijaya, K A Seputra, and Wayan Gede Suka Parwita
- Subjects
History, Computer science, business.industry, Pattern recognition, Plagiarism detection, Artificial intelligence, business, Computer Science Applications, Education
- Abstract
Plagiarism is made easier by how readily data can be distributed. Checking documents such as student assignments and final projects for plagiarism is a lengthy process and is often skipped, yet a document's level of plagiarism must be checked if plagiarism is to be avoided. Detection can be done online or offline with a plagiarism checker, but tools such as Turnitin, Dupli Checker, Copyleaks, PaperRater, Grammarly, and others charge additional fees. Several studies have addressed plagiarism detection, and BM25 and Rabin-Karp are two of the methods used. BM25 is TF-IDF based, while Rabin-Karp is hashing based. The performance of each method in detecting plagiarism needs to be known, so this study compares plagiarism detection using the BM25 algorithm with detection using Rabin-Karp. The case study uses articles in Indonesian. Both algorithms are applied after a preprocessing stage consisting of case folding, cleaning, tokenizing, filtering, and stemming, with the Sastrawi stemmer used for stemming. The test was conducted on twenty articles in Indonesian, and the result examined is performance in the form of execution time.
- Published
- 2021
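The abstract contrasts a TF-IDF-based method (BM25) with a hashing-based one (Rabin-Karp). The following is a minimal Python sketch of that contrast, not the authors' implementation. Everything specific in it is an assumption made for illustration: the function names, the BM25 parameters `k1=1.5` and `b=0.75`, the k-gram length `k=5`, the hash base and modulus, and the use of a Dice coefficient over k-gram hashes as the Rabin-Karp similarity. The `preprocess` helper covers only case folding, cleaning, and tokenizing; stop-word filtering and Sastrawi stemming are left out.

```python
import math
import re
from collections import Counter


def preprocess(text):
    # Case folding, cleaning, and tokenizing only.
    # Filtering and stemming are omitted in this sketch.
    return re.findall(r"[a-z]+", text.lower())


def bm25_score(query, doc, corpus, k1=1.5, b=0.75):
    # Okapi BM25 score of one tokenized document against a tokenized query.
    # k1 and b are common defaults, assumed here.
    n = len(corpus)
    avgdl = sum(len(d) for d in corpus) / n
    tf = Counter(doc)
    score = 0.0
    for term in set(query):
        df = sum(1 for d in corpus if term in d)
        idf = math.log((n - df + 0.5) / (df + 0.5) + 1.0)
        denom = tf[term] + k1 * (1 - b + b * len(doc) / avgdl)
        score += idf * tf[term] * (k1 + 1) / denom
    return score


def kgram_hashes(tokens, k=5, base=256, mod=1_000_003):
    # Rabin-Karp rolling hash over every character k-gram of the joined tokens.
    s = "".join(tokens)
    if len(s) < k:
        return set()
    high = pow(base, k - 1, mod)  # weight of the leading character
    h = 0
    for ch in s[:k]:
        h = (h * base + ord(ch)) % mod
    hashes = {h}
    for i in range(k, len(s)):
        # Drop the leading character, shift, then append the new character.
        h = ((h - ord(s[i - k]) * high) * base + ord(s[i])) % mod
        hashes.add(h)
    return hashes


def rabin_karp_similarity(a, b, k=5):
    # Dice coefficient over the two sets of k-gram hashes (an assumed measure).
    ha, hb = kgram_hashes(a, k), kgram_hashes(b, k)
    if not ha or not hb:
        return 0.0
    return 2 * len(ha & hb) / (len(ha) + len(hb))
```

Since the study reports execution time, each function could be wrapped in a pair of `time.perf_counter()` calls to compare the two approaches on the same pair of documents.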