Abstract
Comprehending source code is difficult, especially when the programmer is unfamiliar with the programming language. Pseudocode describes what the code does, based on semantic analysis and understanding of the source code. In this paper, a novel retrieval-based transformer model for pseudocode generation is proposed. The model combines several retrieval similarity methods with neural machine translation to generate pseudocode, and it handles both low-frequency words and words that do not exist in the training dataset. It consists of three steps. First, the sentences most similar to the input sentence are retrieved using different similarity methods. Second, the retrieved source code (the retrieved input) is passed to a transformer-based deep learning model, which generates the corresponding retrieved pseudocode. Third, a replacement process is applied to the retrieved pseudocode to obtain the target pseudocode. The proposed model is evaluated on the Django and SPoC datasets. The experiments show promising results compared with other machine translation language models: the model achieves BLEU scores of 61.96 on Django and 50.28 on SPoC.
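The three steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy corpus, the function names, and the use of `difflib.SequenceMatcher` as the similarity measure are all assumptions made for illustration, and the transformer in step two is replaced by a simple lookup of the paired pseudocode.

```python
from difflib import SequenceMatcher

# Hypothetical toy parallel corpus: (source line, pseudocode line).
CORPUS = [
    ("x = 0", "set x to 0"),
    ("return x", "return x"),
    ("count = count + 1", "increment count by 1"),
]


def tokenize(line):
    # Whitespace tokenization, assumed here for simplicity.
    return line.split()


def retrieve(query_tokens):
    """Step 1: return the corpus pair whose source line is most similar."""
    def score(pair):
        return SequenceMatcher(None, query_tokens, tokenize(pair[0])).ratio()
    return max(CORPUS, key=score)


def translate(source_line):
    """Step 2 stand-in: a lookup instead of a trained transformer."""
    return dict(CORPUS)[source_line]


def replace_tokens(query_tokens, retrieved_tokens, pseudo_tokens):
    """Step 3: substitute tokens that differ between retrieved and input lines."""
    if len(query_tokens) != len(retrieved_tokens):
        # Positions cannot be aligned one-to-one; leave the pseudocode as is.
        return pseudo_tokens
    mapping = {r: q for q, r in zip(query_tokens, retrieved_tokens) if q != r}
    return [mapping.get(tok, tok) for tok in pseudo_tokens]


def generate(line):
    query = tokenize(line)
    retrieved_source, _ = retrieve(query)
    retrieved_pseudo = translate(retrieved_source)
    target = replace_tokens(query, tokenize(retrieved_source),
                            tokenize(retrieved_pseudo))
    return " ".join(target)


print(generate("total = 7"))
```

In this sketch, the input `total = 7` contains tokens that never occur in the toy corpus. The translation step only ever sees the retrieved line `x = 0`, and the replacement step then reinserts the unseen tokens, which is one plausible reading of how a retrieval step can help with rare or out-of-vocabulary words.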
Highlights
Converting source code to pseudocode is a sub-task of semantic analysis
The retrieval mechanism based on NMT [15] is added to the proposed model to deal with low-frequency tokens and tokens that do not exist in the training dataset
After studying the models that are compared with the proposed model, we find the Recurrent Neural Network (RNN) model has two limitations
Summary
Converting source code to pseudocode is a sub-task of semantic analysis. This conversion is considered one of the problems of converting code to Natural Language (NL) descriptions [1,2,3,4]. It is a challenging problem because the input and the output differ in structure, syntax, and grammar. Several approaches can be followed to solve it. Neural networks [5,6,7] may solve this problem, but they do not ensure that the results are structurally correct. There are several approaches to applying Machine Translation (MT), such as Statistical Machine Translation (SMT) [3,8] and Neural Machine