Flow Chart Generation-Based Source Code Similarity Detection Using Process Mining

Feng Zhang, Lulu Li, Cong Liu, Qingtian Zeng

doi:10.1155/2020/8865413

Open Access

https://doi.org/10.1155/2020/8865413

Abstract

Source code similarity detection has extensive applications in computer programming teaching and software intellectual property protection. In computer programming courses, students may use complex source code obfuscation techniques, e.g., opaque predicates, loop unrolling, and function inlining and outlining, to reduce the similarity between code fragments and evade plagiarism detection. Existing source code similarity detection approaches consider only static features of source code, which makes it difficult for them to cope with these more complex obfuscation techniques. In this paper, we propose a novel source code similarity detection approach that uses process mining to take the runtime (dynamic) features of source code into account. More specifically, given two pieces of source code, their running logs are obtained by instrumenting and executing the source code. Next, process mining is applied to the collected running logs to obtain a flow chart for each piece of source code. Finally, the similarity of the two pieces of source code is measured by computing the similarity of the two flow charts. Experimental results show that the proposed approach can deal with more complex obfuscation techniques, including opaque predicates, loop unrolling, and function inlining and outlining, which existing work cannot handle properly. Therefore, we argue that our approach defeats commonly used code obfuscation techniques more effectively than existing state-of-the-art approaches to source code similarity detection.
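The three-step pipeline described in the abstract (running logs, mined flow model, similarity score) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the hand-inserted `log(...)` calls stand in for automatic instrumentation, a directly-follows graph (a basic process mining structure) stands in for the mined flow chart, and Jaccard similarity over edges stands in for the paper's flow chart similarity measure. All function names and event labels are hypothetical.

```python
from collections import Counter
from typing import Callable, Iterable, List, Set, Tuple

Trace = List[str]
Edge = Tuple[str, str]


def run_instrumented(program: Callable[[Callable[[str], None]], None]) -> Trace:
    """Run `program`, handing it a logger that records each event label."""
    trace: Trace = []
    program(trace.append)
    return trace


def directly_follows(traces: Iterable[Trace]) -> Counter:
    """Count how often event a is immediately followed by event b."""
    edges: Counter = Counter()
    for trace in traces:
        edges.update(zip(trace, trace[1:]))
    return edges


def jaccard(a: Set[Edge], b: Set[Edge]) -> float:
    """Jaccard similarity of two edge sets (1.0 when both are empty)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Hypothetical program 1: sums 0..2 with a loop.
def sum_loop(log: Callable[[str], None]) -> None:
    log("init")
    total = 0
    for i in range(3):
        log("add")
        total += i
    log("return")


# Hypothetical program 2: the same computation with the loop unrolled.
def sum_unrolled(log: Callable[[str], None]) -> None:
    log("init")
    total = 0
    log("add"); total += 0
    log("add"); total += 1
    log("add"); total += 2
    log("return")


# Hypothetical program 3: unrelated behavior, no additions at all.
def no_sum(log: Callable[[str], None]) -> None:
    log("init")
    log("return")


graph_loop = set(directly_follows([run_instrumented(sum_loop)]))
graph_unrolled = set(directly_follows([run_instrumented(sum_unrolled)]))
graph_other = set(directly_follows([run_instrumented(no_sum)]))

sim_obfuscated = jaccard(graph_loop, graph_unrolled)
sim_unrelated = jaccard(graph_loop, graph_other)
```

In this toy setting the looped and unrolled programs differ textually but produce the same event sequence at runtime, so their directly-follows graphs coincide, whereas the unrelated program shares no edges with them. This is only meant to convey why runtime behavior can be more robust to such rewrites than static text; the paper's actual mining and comparison steps may differ substantially.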

Journal: Scientific Programming	Publication Date: Jul 7, 2020
Citations: 5	License type: CC BY 4.0

Similar Papers

Flowchart-Based Cross-Language Source Code Similarity Detection
Feng Zhang ... Guofan Li
Scientific Programming | VOL. 2020
17 Dec 2020

Automatic Code Review by Learning the Revision of Source Code
Shu-Ting Shi ... Xuan Huo
Proceedings of the AAAI Conference on Artificial Intelligence | VOL. 33
17 Jul 2019

Identification of high-level concept clones in source code
A Marcus ... J.I Maletic
26 Nov 2001

Using Stack Overflow content to assist in code review
Shipra Sharma ... Balwinder Sodhi
Software: Practice and Experience | VOL. 49
27 May 2019
