Abstract

This work examines the main strategies used to plagiarize code and analyzes common methods for detecting copied content. Based on this analysis, a study of the subject area, and the requirements formulated from them, a new System for automatically checking software for plagiarism was designed, implemented, and tested. The System takes an aggregated approach that combines several basic similarity detection algorithms, namely the Greedy String Tiling algorithm and the Winnowing (sifting) algorithm. Because the System is intended for programmers, in particular teachers, and must be runnable locally, users interact with it through a command-line interface. The System is implemented in Python, which makes it platform independent. Regular expressions implement the preprocessing and exclusion functions, and the libclang library handles C++ code parsing and tokenization. Promising applications include education and programming competitions: universities and colleges can use the System to check student code for plagiarism, and organizers of hackathons or programming contests can use it to ensure fairness among participants.
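To illustrate the fingerprinting idea behind the sifting algorithm, commonly known as winnowing, combined with regex-based preprocessing, here is a minimal Python sketch. It is not the System's implementation, which the abstract does not show. The function names (`normalize`, `kgram_hashes`, `winnow`, `similarity`), the parameters `k` and `w`, the SHA-1 based hash, and the Jaccard score are all assumptions chosen for illustration, and the comment-stripping regexes are deliberately naive (they would also match comment markers inside string literals).

```python
import hashlib
import re


def normalize(source: str) -> str:
    """Naive preprocessing (assumption): drop C/C++ comments and whitespace."""
    source = re.sub(r"/\*.*?\*/", " ", source, flags=re.DOTALL)  # block comments
    source = re.sub(r"//[^\n]*", " ", source)                    # line comments
    return re.sub(r"\s+", "", source).lower()


def kgram_hashes(text: str, k: int) -> list[int]:
    """Hash every substring of length k (empty list if the text is shorter)."""
    return [
        int.from_bytes(hashlib.sha1(text[i:i + k].encode("utf-8")).digest()[:8], "big")
        for i in range(len(text) - k + 1)
    ]


def winnow(hashes: list[int], w: int) -> set[tuple[int, int]]:
    """Keep the minimum hash of each window of w hashes (rightmost on ties)."""
    if not hashes:
        return set()
    if len(hashes) < w:
        w = len(hashes)  # short input: treat the whole sequence as one window
    fingerprints = set()
    for start in range(len(hashes) - w + 1):
        window = hashes[start:start + w]
        smallest = min(window)
        offset = w - 1 - window[::-1].index(smallest)  # rightmost minimum
        fingerprints.add((smallest, start + offset))
    return fingerprints


def similarity(a: str, b: str, k: int = 5, w: int = 4) -> float:
    """Jaccard similarity of the two fingerprint sets (positions ignored)."""
    fa = {h for h, _ in winnow(kgram_hashes(normalize(a), k), w)}
    fb = {h for h, _ in winnow(kgram_hashes(normalize(b), k), w)}
    if not fa or not fb:
        return 0.0
    return len(fa & fb) / len(fa | fb)
```

In this sketch, two submissions that differ only in comments and whitespace receive the same fingerprints, while renamed identifiers are not handled. Detecting renames would require comparing token streams (for example, ones produced by libclang) rather than raw characters.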
