Abstract

Code cloning is a common programming practice in which existing code is reused to solve similar programming problems. While it greatly facilitates software development, it also propagates program bugs and increases maintenance costs. Recently, deep learning-based detection approaches have demonstrated their effectiveness in feature representation and detection performance. Among them, approaches based on the abstract syntax tree (AST) build their models on node embedding techniques. In an AST, the semantics of nodes are inherently hierarchical, and nodes differ considerably in how much they matter for deciding whether two code fragments are clones. However, some existing approaches do not fully exploit the hierarchical structure of source code, while others ignore the differing importance of nodes when generating code features. Moreover, when the tree is very large and deep, many approaches suffer from the vanishing gradient problem during training. To address these challenges, we propose HAG, a hierarchical attentive graph neural network embedding model for code clone detection. First, an attention mechanism is applied to the nodes of the AST to distinguish their importance during model training. In addition, HAG adopts a graph convolutional network (GCN) to propagate code information over the AST graph and then exploits a hierarchical differentiable pooling GCN to capture code semantics at different structural levels. To evaluate the effectiveness of HAG, we conduct extensive experiments on a public clone dataset and compare it with seven state-of-the-art clone detection models. The experimental results demonstrate that HAG achieves superior detection performance compared with the baseline models. In particular, HAG outperforms the baselines on Moderately Type-3 and Type-4 clones, indicating its strong capability for detecting semantic clones. Beyond that, the impacts of hierarchical pooling, the attention mechanism, and critical model parameters are systematically discussed.
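To make the pipeline described above concrete, the following is a minimal sketch (not the authors' released code) of the three stages the abstract names: attention-based re-weighting of AST node features, GCN message passing over the AST graph, and one DiffPool-style hierarchical pooling step. All class names, layer sizes, and the toy graph are illustrative assumptions; only standard PyTorch is used.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class GCNLayer(nn.Module):
    """One graph convolution: H' = ReLU(A_hat @ H @ W)."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)

    def forward(self, a_hat, h):
        # a_hat: (n, n) normalized adjacency; h: (n, in_dim) node features
        return F.relu(self.linear(a_hat @ h))


class HAGSketch(nn.Module):
    """Hypothetical single-stage version of the HAG-style forward pass."""
    def __init__(self, in_dim, hid_dim, n_clusters):
        super().__init__()
        self.att = nn.Linear(in_dim, 1)               # scores node importance
        self.gcn_embed = GCNLayer(in_dim, hid_dim)    # node embeddings
        self.gcn_pool = GCNLayer(in_dim, n_clusters)  # soft cluster assignments

    def forward(self, a_hat, x):
        # 1) Attention: re-weight nodes by learned importance.
        alpha = torch.softmax(self.att(x), dim=0)          # (n, 1)
        x = alpha * x
        # 2) GCN propagates code information over the AST graph.
        z = self.gcn_embed(a_hat, x)                       # (n, hid_dim)
        # 3) DiffPool-style coarsening: soft-assign nodes to clusters,
        #    yielding a smaller graph that summarizes higher-level structure.
        s = torch.softmax(self.gcn_pool(a_hat, x), dim=-1) # (n, k)
        x_coarse = s.t() @ z                               # (k, hid_dim)
        a_coarse = s.t() @ a_hat @ s                       # (k, k)
        return x_coarse, a_coarse


# Usage: a toy 5-node "AST" with 16-dim features, pooled to 2 clusters.
n = 5
adj = torch.eye(n) + torch.rand(n, n).round()   # self-loops + random edges
a_hat = adj / adj.sum(dim=1, keepdim=True)      # row-normalized adjacency
x = torch.rand(n, 16)
model = HAGSketch(in_dim=16, hid_dim=32, n_clusters=2)
x_c, a_c = model(a_hat, x)
print(x_c.shape, a_c.shape)  # torch.Size([2, 32]) torch.Size([2, 2])
```

In a full model, several such pooling stages would be stacked so that the graph is coarsened repeatedly, which is what lets the representation capture code semantics at multiple structural levels; a graph-level embedding for clone comparison could then be read out from the final coarse graph.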
