Abstract

Plagiarism is a serious threat, especially to academic honesty, so a detection system that can analyze various types of documents is needed. This research develops a plagiarism detection system that uses Optical Character Recognition (OCR) to convert text in images into digital text. The Rabin-Karp algorithm with a rolling hash and the Dice coefficient are applied to measure the similarity between documents. Testing is carried out on .doc, .txt, and .jpg files. The results show that the system detects plagiarism well in clear text and image documents, but accuracy can decrease for low-quality images. In conclusion, the similarity of content, sentence structure, and format affects the degree of similarity, and OCR works effectively, although its performance is limited on low-quality images.
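
The similarity step named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The k-gram length `k`, the constants `BASE` and `MOD`, the normalization rule, and all function names are assumptions chosen for illustration. The input strings stand in for text already extracted from a document or produced by OCR.

```python
import re

BASE = 256            # assumed radix for the polynomial hash
MOD = 1_000_000_007   # assumed large prime modulus


def normalize(text):
    """Lowercase the text and keep only ASCII letters and digits."""
    return re.sub(r"[^a-z0-9]", "", text.lower())


def kgram_hashes(text, k=5):
    """Return the set of Rabin-Karp rolling-hash values of all k-grams."""
    s = normalize(text)
    if len(s) < k:
        return set()
    high = pow(BASE, k - 1, MOD)  # weight of the character leaving the window
    h = 0
    for ch in s[:k]:              # hash of the first window
        h = (h * BASE + ord(ch)) % MOD
    hashes = {h}
    for i in range(k, len(s)):
        h = (h - ord(s[i - k]) * high) % MOD  # drop the oldest character
        h = (h * BASE + ord(s[i])) % MOD      # append the newest character
        hashes.add(h)
    return hashes


def dice_similarity(a, b, k=5):
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) over the k-gram hash sets."""
    ha, hb = kgram_hashes(a, k), kgram_hashes(b, k)
    if not ha and not hb:
        return 0.0
    return 2 * len(ha & hb) / (len(ha) + len(hb))


if __name__ == "__main__":
    original = "Plagiarism is a serious threat to academic honesty."
    suspect = "Plagiarism is a serious threat to academic integrity."
    print(f"similarity: {dice_similarity(original, suspect):.2f}")
```

Because the comparison uses hash values rather than the k-grams themselves, rare hash collisions can slightly inflate the score. OCR errors in low-quality images change the extracted characters, which removes matching k-grams and lowers the score, consistent with the accuracy drop reported above.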
