Abstract

Code clones make software harder to maintain and cause bugs to propagate. We propose a framework, called CCDLC, for detecting Java code obfuscation and both syntactic and semantic clones. It augments CNN-based deep learning classification with cluster data produced by the sequential information bottleneck algorithm. CCDLC uses a novel Java bytecode dependency graph (BDG) along with program dependency graph (PDG) and abstract syntax tree (AST) features. We validate the effectiveness of our framework on several publicly available code clone and Java obfuscated code datasets. Our experimental results and evaluation indicate that combining clustering with deep learning classification is a viable methodology, since the combination improves detection of clones and obfuscated code on the corpus. The key benefit of this approach is that our tool improves obfuscation detection accuracy by about 5.44% and improves detection accuracy for both syntactic and semantic clones by about 12%.
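To make the idea of "adding cluster data" concrete, the following is a minimal sketch, not the authors' implementation. It assumes that each code fragment has already been reduced to a numeric feature vector (for example from BDG, PDG, and AST features) and that cluster centroids are available. Nearest-centroid assignment is used here purely as a stand-in for the sequential information bottleneck algorithm, and all names and values are hypothetical.

```python
from typing import List, Sequence


def nearest_centroid(vec: Sequence[float],
                     centroids: Sequence[Sequence[float]]) -> int:
    """Return the index of the centroid closest to vec (squared Euclidean)."""
    def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(range(len(centroids)), key=lambda k: sq_dist(vec, centroids[k]))


def augment_with_cluster(vec: Sequence[float],
                         centroids: Sequence[Sequence[float]]) -> List[float]:
    """Append a one-hot cluster-membership vector to the original features.

    The augmented vector is what a downstream classifier would consume.
    """
    k = nearest_centroid(vec, centroids)
    one_hot = [1.0 if i == k else 0.0 for i in range(len(centroids))]
    return list(vec) + one_hot


# Hypothetical 3-dimensional feature vector and two hypothetical centroids.
centroids = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]]
sample = [0.9, 0.8, 1.0]
augmented = augment_with_cluster(sample, centroids)
```

In this sketch the classifier input grows by one dimension per cluster, so the cluster assignment becomes an additional signal next to the original features.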

