Hierarchical semantic-aware neural code representation

Yuan Jiang, Xiaohong Su, Christoph Treude, Tiantian Wang

doi:10.1016/j.jss.2022.111355

Abstract

Code representation is a fundamental problem in many software engineering tasks. Despite considerable research effort, existing methods still struggle to fully extract the syntactic, structural, and sequential features of source code, which together form the hierarchical semantics of a program and are necessary for deeper code understanding. To alleviate this difficulty, we propose a new supervised approach based on a novel use of Tree-LSTM that explicitly incorporates the sequential and global semantic features of programs into the representation model. Unlike previous techniques, our model learns not only the low-level syntactic information within each statement but also the high-level semantic information between statements over the constructed semantic graph. In addition, because sequential semantics are critical for developers to understand dependency paths and data flow, we propose a DFS-based method to generate a topological order of the statements to be processed, and then feed the statements, together with their in-neighbor information and syntactic embeddings, into the model to learn richer statement-level semantic features. Extensive experiments on multiple program comprehension tasks, e.g., code clone detection, demonstrate that our method achieves promising performance compared with existing baselines.
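
The abstract mentions a DFS-based method for producing a topological order of statements and for collecting each statement's in-neighbors, but gives no algorithmic detail. The following is a minimal sketch of a generic DFS topological sort (reverse postorder) over a small statement dependency graph, not the authors' implementation. The graph, the statement ids (s1 to s4), and the function names are hypothetical and chosen only for illustration. Edges that would close a cycle (for example, from a loop) are simply skipped here, which is an assumption of this sketch.

```python
from typing import Dict, List


def dfs_topological_order(successors: Dict[str, List[str]]) -> List[str]:
    """Return statements in reverse DFS postorder.

    For an acyclic graph this is a valid topological order. Edges that
    lead back to an already visited node are ignored (sketch assumption).
    """
    visited = set()
    postorder: List[str] = []

    def visit(node: str) -> None:
        visited.add(node)
        for nxt in successors.get(node, []):
            if nxt not in visited:
                visit(nxt)
        # A node is emitted only after everything reachable from it.
        postorder.append(node)

    for node in successors:
        if node not in visited:
            visit(node)
    return postorder[::-1]


def in_neighbors(successors: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Invert the edge map so each statement lists its predecessors."""
    preds: Dict[str, List[str]] = {node: [] for node in successors}
    for src, dsts in successors.items():
        for dst in dsts:
            preds.setdefault(dst, []).append(src)
    return preds


# Hypothetical dependency graph for four statements:
#   s1: x = read()   s2: y = x + 1   s3: z = x * 2   s4: print(y + z)
graph = {"s1": ["s2", "s3"], "s2": ["s4"], "s3": ["s4"], "s4": []}

order = dfs_topological_order(graph)
preds = in_neighbors(graph)

for stmt in order:
    # Every in-neighbor of stmt has already been processed at this point.
    print(stmt, "<-", preds[stmt])
```

Processing statements in this order means that, when a statement is reached, the states of all its in-neighbors are already available, which is the property a statement-level model that consumes in-neighbor information would rely on.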

Journal: Journal of Systems and Software
Publication Date: May 6, 2022
Citations: 9

Similar Papers

Combining Holistic Source Code Representation with Siamese Neural Networks for Detecting Code Clones
Smit Patel ... Roopak Sinha
01 Jan 2021

A Mocktail of Source Code Representations
Dheeraj Vagavolu ... Karthik Chandra Swarna
01 Nov 2021

XCode: Towards Cross-Language Code Representation with Large-Scale Pre-Training
Zehao Lin ... Xiangji Zeng
ACM Transactions on Software Engineering and Methodology | VOL. 31
09 Apr 2022

Java Code Clone Detection by Exploiting Semantic and Syntax Information From Intermediate Code-Based Graph
Dawei Yuan ... Tao Zhang
IEEE Transactions on Reliability | VOL. 72
01 Jun 2023
