Abstract

The purpose of this research is to identify the types of plagiarism that occur in basic programming courses, as preliminary work toward an intelligent system for detecting source code plagiarism in programming classes. The research uses a hierarchical clustering method applied to term-document matrices. We propose using the keywords of the programming language to form the term-document matrix. The matrix is normalized, and Euclidean distance is then used to measure similarity between documents. The results show that this method is quite effective at grouping data objects that indicate plagiarism. The most frequent types of plagiarism are reformatting source code, changing comments, renaming identifiers, and adding redundant statements or variables.
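
The pipeline described above can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not state the programming language of the submissions, the keyword list, the normalization scheme, or the linkage criterion, so the C-style keyword subset `KEYWORDS`, the unit-length (L2) normalization, the single-linkage merge rule, the `threshold` value, and the three sample submissions are all assumptions made for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical keyword subset (C-style); the source does not list the keywords used.
KEYWORDS = ["if", "else", "for", "while", "do", "switch", "case",
            "break", "return", "int", "float"]


def keyword_vector(source):
    """One row of the term-document matrix: keyword counts for one file."""
    # Drop /* ... */ and // comments so edited comments do not change the counts.
    # (Simplification: comment markers inside string literals are not handled.)
    code = re.sub(r"/\*.*?\*/|//[^\n]*", " ", source, flags=re.S)
    counts = Counter(re.findall(r"[A-Za-z_]\w*", code))
    return [counts[k] for k in KEYWORDS]


def normalize(vec):
    """Scale a vector to unit length; an all-zero vector is returned unchanged."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else list(vec)


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def single_linkage(vectors, threshold):
    """Agglomerative clustering: merge the two closest clusters repeatedly,
    stopping once the closest pair is farther apart than `threshold`."""
    clusters = [[i] for i in range(len(vectors))]

    def dist(c1, c2):
        return min(euclidean(vectors[i], vectors[j]) for i in c1 for j in c2)

    while len(clusters) > 1:
        d, a, b = min((dist(clusters[a], clusters[b]), a, b)
                      for a in range(len(clusters))
                      for b in range(a + 1, len(clusters)))
        if d > threshold:
            break
        merged = clusters.pop(b)  # b > a, so index a is still valid
        clusters[a].extend(merged)
    return [sorted(c) for c in clusters]


# Three made-up submissions: the second is the first with reformatting,
# a changed comment, and renamed identifiers; the third is unrelated.
original = """
int main() {
    int total = 0;  // sum
    for (int i = 0; i < 10; i++) { if (i % 2 == 0) total += i; }
    return total;
}
"""
disguised = """
/* my own work */
int main()
{
    int s = 0;
    for (int k = 0; k < 10; k++)
    {
        if (k % 2 == 0)
            s += k;
    }
    return s;
}
"""
unrelated = """
float average(float a, float b) {
    while (a > b) { a = a - 1; }
    return (a + b) / 2;
}
"""

matrix = [normalize(keyword_vector(s)) for s in (original, disguised, unrelated)]
clusters = single_linkage(matrix, threshold=0.1)
print(clusters)  # → [[0, 1], [2]]
```

Because only language keywords are counted, reformatting, comment changes, and identifier renaming leave the vector untouched, which is why the first two submissions land in the same cluster. Adding redundant statements would shift the counts, so detecting that case depends on the chosen threshold.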

