A Deep Learning Approach for a Source Code Detection Model Using Self-Attention

Yao Meng, Long Liu

doi:10.1155/2020/5027198

Abstract

With the development of deep learning, many neural-network-based approaches have been proposed for code clone detection. In this paper, we propose a novel source code clone detection model, At-BiLSTM, based on a bidirectional LSTM network with a self-attention layer. At-BiLSTM is composed of a representation model and a discriminative model. The representation model first transforms the source code into an abstract syntax tree and splits it into a sequence of statement trees; it then encodes each statement tree with a depth-first traversal algorithm. Finally, the representation model encodes the sequence of statement vectors via a bidirectional LSTM network, a classical deep learning framework, with a self-attention layer, and outputs a vector representing the given source code. The discriminative model identifies code clones based on the vectors generated by the representation model. The proposed model retains both the syntax and the semantics of the source code during encoding, and the self-attention algorithm makes the classifier concentrate on the effect of key statements, which improves classification performance. Comparative experiments on the benchmarks OJClone and BigCloneBench indicate that At-BiLSTM is effective and outperforms state-of-the-art approaches in source code clone detection.

Highlights

In modern society, computers and software pervade our lives. The necessities of life, such as medical care, resources, communication, and public security, depend on high-quality software running reliably
ASTNN adopts a strategy of subtree decomposition and recursive encoding, combined with a bidirectional LSTM, to leverage the naturalness of statements. The representation model of ASTNN reduces the depth of syntax trees and exploits the natural sequence of statements in the code
We have presented a novel source code clone detection model, At-BiLSTM, based on deep learning with a self-attention mechanism, which captures both the syntactic and semantic information of the code during encoding
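The subtree decomposition mentioned in the highlights can be illustrated with a minimal sketch. It is not the authors' implementation: it uses Python's standard `ast` module as a stand-in parser, and the helper names `split_statements` and `depth_first_tokens` are hypothetical. It splits a syntax tree into statement trees in source order and lists the node types of one statement tree in depth-first (preorder) order.

```python
import ast


def split_statements(source):
    """Return every statement node of `source` in preorder (source order)."""
    statements = []

    def visit(node):
        if isinstance(node, ast.stmt):
            statements.append(node)
        for child in ast.iter_child_nodes(node):
            visit(child)

    visit(ast.parse(source))
    return statements


def depth_first_tokens(node):
    """List node-type names of one statement tree in depth-first order.

    Nested statements are skipped, because in this sketch each of them
    forms a statement tree of its own (an assumption about the split).
    """
    tokens = [type(node).__name__]
    for child in ast.iter_child_nodes(node):
        if isinstance(child, ast.stmt):
            continue
        tokens.extend(depth_first_tokens(child))
    return tokens


example = "def f(x):\n    y = x + 1\n    return y\n"
statement_trees = split_statements(example)
for tree in statement_trees:
    print(depth_first_tokens(tree))
```

In a full model, each token sequence would be mapped to embeddings and combined into one statement vector; that encoder is omitted here because the page does not describe it in detail.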

Summary

Introduction

Computers and software pervade our lives. The necessities of life, such as medical care, resources, communication, and public security, depend on high-quality software running reliably. Scholars [7, 8] have tried to represent source code with other techniques, such as the latent semantic index [9] and hash mapping [10], to improve the detection efficiency of the model. These representation approaches are based on NLP methods, which means that only simple clone pairs, i.e., type 1 and type 2 clones, are detectable. We propose a source code clone detection model, At-BiLSTM, using a bidirectional LSTM with a self-attention mechanism, which retains both the syntax and the semantics of the code during representation.
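How a self-attention layer can let key statements dominate the code vector is sketched below. The scoring form is an assumption, since this page does not give the paper's equations: each hidden state is scored by a dot product with a context vector (learned in practice, fixed here), the scores are normalized with softmax, and the code vector is the weighted sum of hidden states. The names `attention_pool` and `context` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, context):
    """Pool T hidden states into one vector with attention weights.

    hidden_states: list of T vectors, e.g. BiLSTM outputs per statement.
    context: vector of the same dimension; a trained parameter in a
             real model, supplied by hand in this sketch.
    """
    scores = [sum(h_d * c_d for h_d, c_d in zip(h, context))
              for h in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    pooled = [sum(w * h[d] for w, h in zip(weights, hidden_states))
              for d in range(dim)]
    return pooled, weights


# Three toy statement states; the third aligns best with the context.
states = [[1.0, 0.0], [0.0, 1.0], [3.0, 0.0]]
code_vector, attn = attention_pool(states, context=[1.0, 0.0])
print(attn)
```

The statement whose hidden state aligns best with the context vector receives the largest weight, so it contributes most to the pooled code vector that a downstream classifier would compare.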

Related Work

Preliminaries

Our Proposed Approach

Method Declaration

Experiments

Findings

Discussion

Conclusion

Journal: Complexity
Publication Date: Sep 16, 2020
Citations: 11
License type: CC BY 4.0
