Abstract

Source-code plagiarism detection concerns the identification of source-code files that contain similar or identical source-code fragments. Based on an analysis of the characteristics and shortcomings of existing code similarity detection systems, we propose a source-code similarity detection method based on the Abstract Implementation Structure Diagram (AISD). The source code is modelled and formatted as an abstract implementation structure diagram, from which structural features and variable position features are extracted to form structural feature strings and variable reference relationship sequences. The overall similarity is then computed from the structural similarity and the variable similarity. The results show that the proposed AISD-based approach outperforms other approaches on the same source-code datasets, and indicate that it is a promising, efficient, and reliable approach to source-code plagiarism detection.
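As a rough illustration of the final step, the sketch below combines a structural similarity and a variable similarity into one overall score. It is not the paper's formula: the use of Python's `difflib.SequenceMatcher` as the sequence measure, the function names, and the equal default `weight` are all assumptions made for the example.

```python
from difflib import SequenceMatcher


def sequence_similarity(a, b):
    """Similarity ratio in [0, 1]; an assumed stand-in for the paper's measure."""
    return SequenceMatcher(None, a, b).ratio()


def overall_similarity(struct_a, struct_b, vars_a, vars_b, weight=0.5):
    """Weighted mix of structural and variable similarity.

    struct_a, struct_b: structural feature strings of the two programs.
    vars_a, vars_b: variable reference sequences of the two programs.
    weight: share given to structural similarity (assumed, not from the paper).
    """
    structural = sequence_similarity(struct_a, struct_b)
    variable = sequence_similarity(vars_a, vars_b)
    return weight * structural + (1 - weight) * variable
```

Two programs with identical feature strings and identical variable sequences score 1.0 under this sketch; how the paper actually weights the two components is not stated in the abstract.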

Highlights

  • Source-code plagiarism is a growing problem, driven by the growth of source-code repositories and of digital documents available on the Internet

  • Liang [3] carried out a series of studies on duplicate-code detection and proposed a detection method based on process blueprints, which avoids the complicated step of transforming source code into a suffix tree and thereby reduces complexity

  • Because the program statement feature captures the hierarchical relationship of program actions, it represents the positional relationship between nodes in the abstract implementation structure diagram


Summary

Introduction

Source-code plagiarism is a growing problem, driven by the growth of source-code repositories and of digital documents available on the Internet. In computer science education, copying between students is widespread and seriously undermines the development of students' abilities. In recent studies abroad, about 33% of students admitted to having plagiarized [1]. Plagiarism has thus seriously affected the quality of computer science education. To curb this academic misconduct, it has become increasingly necessary for researchers to study code plagiarism detection methods.

The Abstract
Related work
Detection process and algorithms
Construction of structural feature strings
Variable reference sequence construction
Similarity calculation
Datasets and Metrics
Performance evaluation measures for plagiarism detection
Experiment result
Findings
Conclusion and future work
