Abstract

Previous studies have proposed semantic-based approaches for code search over large-scale codebases, bridging the semantic gap between natural language and source code. However, these studies either failed to determine an effective method for semantic representation or did not distinguish among semantic features. In this study, we propose CSDA (Code Search based on Description Attention), a novel attention-based LSTM neural network that effectively improves code search performance. When multiple aspects of a source code snippet are used as input, the model can focus on different parts of the semantic feature. Rather than assigning the same weight to every part of the semantic vector, CSDA takes the semantics of the natural-language description into account, so that subtle differences hidden in the code snippet can be discriminated and associated with the corresponding queries. We compare CSDA with CODEnn, the existing state-of-the-art approach, which uses a joint embedding technique for code search. Our experimental evaluation demonstrates that CSDA outperforms CODEnn, achieving higher success rates and mean reciprocal ranks. This study provides significant insights into the use of semantic representation methods in deep-learning-based code search approaches.
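
To make the attention idea concrete, the following is a minimal sketch, assuming a standard dot-product attention formulation: an LSTM has already encoded the code snippet into per-token hidden states and the description into a single summary vector. The class name DescriptionAttention, the projection layer, and all dimensions are hypothetical illustrations, not the paper's exact architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DescriptionAttention(nn.Module):
    """Sketch of description-guided attention: weight the LSTM states of a
    code snippet by their relevance to an embedded natural-language query."""

    def __init__(self, hidden_dim: int):
        super().__init__()
        # Project the description before scoring (an assumed design choice;
        # the abstract does not spell out the exact scoring function).
        self.proj = nn.Linear(hidden_dim, hidden_dim)

    def forward(self, code_states: torch.Tensor, desc_vec: torch.Tensor) -> torch.Tensor:
        # code_states: (batch, seq_len, hidden) per-token LSTM outputs for the code
        # desc_vec:    (batch, hidden) summary vector of the description/query
        query = self.proj(desc_vec)                                       # (batch, hidden)
        scores = torch.bmm(code_states, query.unsqueeze(-1)).squeeze(-1)  # (batch, seq_len)
        weights = F.softmax(scores, dim=-1)   # description-dependent attention weights
        # Weighted sum instead of uniform pooling: the parts of the snippet
        # that match the query dominate the final code embedding.
        return torch.sum(weights.unsqueeze(-1) * code_states, dim=1)      # (batch, hidden)

# Usage: pool 20 code-token states into one description-aware code vector.
attn = DescriptionAttention(hidden_dim=128)
code = torch.randn(4, 20, 128)
desc = torch.randn(4, 128)
pooled = attn(code, desc)  # shape: (4, 128)
```

In contrast to uniform pooling, the attention weights here depend on the query, so the same code snippet can yield different embeddings for different descriptions, which is the discriminative behavior the abstract describes.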
