Abstract

A parameter is a variable that significantly affects the result a method computes. In Rabin-Karp, several parameters determine the accuracy of the algorithm, and together they determine the level of similarity reported between documents. The method used here is Rabin-Karp, applied to plagiarism checking. Rabin-Karp works by splitting documents into words (tokenizing). The resulting tokens are then mapped into word snippets (N-Grams) of equal length. The main parameters that determine the accuracy of the similarity are N-Gram, Base, and Modulo. The N-Gram length can be varied and is chosen according to the desired target. The Modulo is a specific prime number. Different combinations of N-Gram, Base, and Modulo values produce different results. Each N-Gram undergoes a Hash calculation that assigns a value to that word snippet. The Hash value also depends on the Base and Modulo provided. The combination of these three values determines the accuracy of the document similarity percentage. The Hash values generated for the two documents are compared, and the identical hashes found determine the similarity level obtained. A proper combination improves the accuracy of the calculation.
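The pipeline described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names, the character-level N-Grams taken over the joined tokens, the polynomial hash form, and the Dice-style similarity formula are all assumptions, since the abstract does not specify them.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def ngrams(tokens, n):
    """Slide a window of n characters over the joined tokens (assumed N-Gram scheme)."""
    joined = "".join(tokens)
    return [joined[i:i + n] for i in range(len(joined) - n + 1)]


def gram_hash(gram, base, modulo):
    """Polynomial hash of one N-Gram, reduced by modulo at every step (Horner's rule)."""
    h = 0
    for ch in gram:
        h = (h * base + ord(ch)) % modulo
    return h


def similarity(doc_a, doc_b, n, base, modulo):
    """Percentage of shared hash values between two documents.

    Assumption: Dice-style ratio over the sets of distinct hashes,
    2 * shared / (count_a + count_b) * 100.
    """
    hashes_a = {gram_hash(g, base, modulo) for g in ngrams(tokenize(doc_a), n)}
    hashes_b = {gram_hash(g, base, modulo) for g in ngrams(tokenize(doc_b), n)}
    if not hashes_a and not hashes_b:
        return 0.0
    shared = len(hashes_a & hashes_b)
    return 200.0 * shared / (len(hashes_a) + len(hashes_b))
```

Under these assumptions, the effect of the parameter combination is easy to see. With N-Gram length 2, Base 10, and Modulo 11, the snippets "aa" and "bb" both hash to 0, so two unrelated texts can appear identical. A larger prime such as 10007 keeps the two hashes distinct.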
