Semantic Clone Detection Based on Code Feature Fusion Learning

Qianjin Zhang, Yawen Wang, Yunzhan Gong, Dahai Jin

doi:10.1142/s0218194023500249

Abstract

Code clones are duplicated code snippets that pose a significant threat to software maintenance and to the public corpora used for code representation learning. Source code is typically represented either by its token context or by its structural information, such as the abstract syntax tree (AST) and the control flow graph (CFG), and both context-based and structure-based models have contributed significantly to the development of code clone detection. In this paper, we present a hybrid embedding model for code clone detection (HEM-CCD), which fuses token sequence information with graph-based structural information. We encode the global context of the tokens with a bi-directional recurrent neural network and insert it into the AST-based graph to obtain a comprehensive semantic representation of the code. We then feed the graph into a gated graph neural network to generate code semantic vectors for similarity evaluation. We evaluated our model on two public clone datasets (BigCloneBench and GoogleCodeJam), and the results indicate that HEM-CCD outperforms several state-of-the-art approaches.
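The overall flow described in the abstract (code, then AST-based graph, then vector, then similarity score) can be illustrated with a toy sketch. This is not HEM-CCD: the learned bi-directional RNN and gated graph neural network are replaced here by a hand-built count vector over AST edges, and Python's standard `ast` module stands in for the parsers used on the paper's datasets. The names `ast_edges`, `cosine`, `clone_score`, and the three sample snippets are hypothetical and chosen only for illustration.

```python
import ast
import math
from collections import Counter


def ast_edges(source: str) -> Counter:
    """Count (parent type, child type) edges in the AST of a Python snippet.

    Assumption: this count vector is a crude stand-in for the learned
    graph embedding; it ignores identifier names and token order.
    """
    counts = Counter()
    stack = [ast.parse(source)]
    while stack:
        node = stack.pop()
        for child in ast.iter_child_nodes(node):
            counts[(type(node).__name__, type(child).__name__)] += 1
            stack.append(child)
    return counts


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    norm = norm_a * norm_b
    return dot / norm if norm else 0.0


def clone_score(src_a: str, src_b: str) -> float:
    """Similarity of two snippets based on their AST edge counts."""
    return cosine(ast_edges(src_a), ast_edges(src_b))


# Hypothetical sample snippets: B is A with every identifier renamed,
# C does something unrelated.
SNIPPET_A = "def total(xs):\n    s = 0\n    for x in xs:\n        s += x\n    return s\n"
SNIPPET_B = "def acc(values):\n    r = 0\n    for v in values:\n        r += v\n    return r\n"
SNIPPET_C = "def greet(name):\n    print('hello ' + name)\n"

if __name__ == "__main__":
    print("A vs B:", clone_score(SNIPPET_A, SNIPPET_B))
    print("A vs C:", clone_score(SNIPPET_A, SNIPPET_C))
```

Because the vector depends only on tree structure, the renamed pair scores close to 1.0, while the unrelated pair scores lower. A purely structural count like this cannot capture semantic clones that differ in structure, which is the case that motivates learned representations combining token context with graph structure.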

Journal: International Journal of Software Engineering and Knowledge Engineering

Similar Papers

Java Code Clone Detection by Exploiting Semantic and Syntax Information From Intermediate Code-Based Graph
Dawei Yuan ... Tao Zhang
IEEE Transactions on Reliability | VOL. 72
01 Jun 2023

Combining Holistic Source Code Representation with Siamese Neural Networks for Detecting Code Clones
Smit Patel ... Roopak Sinha
01 Jan 2021

SCCD-GAN: An Enhanced Semantic Code Clone Detection Model Using GAN
Kun Xu ... Yan Liu
17 Dec 2021

Parallel and Distributed Code Clone Detection using Sequential Pattern Mining
Ali El-Matarawy ... Reem Bahgat
International Journal of Computer Applications | VOL. 62
18 Jan 2013
