Abstract

Understanding the content of source code and its regular expressions is very difficult when they are written in an unfamiliar language. Pseudo-code explains and describes the content of the code without using the syntax or technologies of a programming language. However, writing Pseudo-code for each code instruction is laborious. Recently, neural machine translation has been used to generate textual descriptions of source code. In this paper, a novel deep learning-based transformer (DLBT) model is proposed for automatic Pseudo-code generation from source code. The proposed model uses deep learning based on Neural Machine Translation (NMT) to work as a language translator. The DLBT is built on the transformer, which is an encoder-decoder structure. It has three major components: tokenizer and embeddings, transformer, and post-processing. Each code line is tokenized into dense vectors. The transformer then captures the relatedness between the source code and the matching Pseudo-code without the need for a Recurrent Neural Network (RNN). In the post-processing step, the generated Pseudo-code is optimized. The proposed model is assessed using a real Python dataset, which contains more than 18,800 lines of source code written in Python. The experiments show promising performance compared with other machine translation methods such as RNNs. The proposed DLBT records an accuracy of 47.32 and a BLEU score of 68.49.
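For concreteness, the sketch below wires together the three components named in the abstract (tokenizer and embeddings, transformer encoder-decoder, post-processing) using PyTorch. The class name, vocabulary sizes, and hyperparameters are illustrative assumptions rather than the paper's exact configuration, and positional encodings are omitted for brevity.

    # Minimal sketch of the three-component pipeline described above.
    # All names, vocabulary handling, and hyperparameters are assumptions,
    # not the paper's exact configuration.
    import torch
    import torch.nn as nn

    class CodeToPseudocode(nn.Module):
        def __init__(self, src_vocab, tgt_vocab, d_model=256, nhead=8, num_layers=6):
            super().__init__()
            # 1) Embeddings: each token id is mapped to a dense vector.
            self.src_embed = nn.Embedding(src_vocab, d_model)
            self.tgt_embed = nn.Embedding(tgt_vocab, d_model)
            # 2) Transformer encoder-decoder: self-attention lets every layer
            #    attend to all input tokens, with no recurrence.
            #    (Positional encodings are omitted here for brevity.)
            self.transformer = nn.Transformer(d_model=d_model, nhead=nhead,
                                              num_encoder_layers=num_layers,
                                              num_decoder_layers=num_layers,
                                              batch_first=True)
            self.generator = nn.Linear(d_model, tgt_vocab)

        def forward(self, src_ids, tgt_ids):
            src = self.src_embed(src_ids)
            tgt = self.tgt_embed(tgt_ids)
            out = self.transformer(src, tgt)
            return self.generator(out)  # token logits for the pseudo-code

    def post_process(tokens):
        # 3) Post-processing: drop special tokens and join into a sentence.
        return " ".join(t for t in tokens if t not in {"<pad>", "<sos>", "<eos>"})

    # Example: one tokenized source-code line and a shifted pseudo-code prefix.
    model = CodeToPseudocode(src_vocab=5000, tgt_vocab=5000)
    src = torch.randint(0, 5000, (1, 12))   # e.g. the tokens of "for i in range(10):"
    tgt = torch.randint(0, 5000, (1, 8))    # decoder input (pseudo-code so far)
    logits = model(src, tgt)                # shape (1, 8, 5000)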

Highlights

  • In the software development cycle [1], there are many different ways of writing code based on the syntax of the programming language

  • The proposed Deep Learning-Based Transformer (DLBT) model generates Pseudo-code from the source code based on the Transformer Neural Machine Translation (TNMT)

  • Pseudo-code is produced in three forms: the manual output written by a professional programmer, the output of the six-layer deep learning-based transformer (DLBT), and the output of the eight-layer DLBT


Summary

Introduction

In the software development cycle [1], there are many different ways of writing code based on the syntax of the programming language. The problem with the RNN lies in the training process: weight updates may shrink to very small values, which is called the vanishing gradient [5], or grow to very large values, which is called the exploding gradient. To address this problem, the Long Short-Term Memory (LSTM) model is used [6], as it supports language modeling [7]. A novel deep learning-based transformer (DLBT) model is proposed for automatic Pseudo-code generation from source code. The main goal of the proposed model is to generate the Pseudo-code automatically while avoiding the vanishing gradient problem, because each layer of the transformer has access to all input tokens.
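The snippet below is a minimal sketch of scaled dot-product self-attention, the mechanism that gives each transformer layer direct access to every input token in a single step instead of passing information through a long recurrent chain; the dimensions and weight matrices are illustrative assumptions.

    # Sketch of scaled dot-product self-attention. Unlike an RNN, no signal
    # has to survive many recurrent steps, which is what makes RNN gradients
    # vanish or explode. Shapes are illustrative.
    import math
    import torch

    def self_attention(x, w_q, w_k, w_v):
        q, k, v = x @ w_q, x @ w_k, x @ w_v              # queries, keys, values
        scores = q @ k.transpose(-2, -1) / math.sqrt(k.size(-1))
        weights = scores.softmax(dim=-1)                 # every token attends to every token
        return weights @ v

    tokens = torch.randn(12, 64)                         # 12 source-code tokens, 64-dim embeddings
    w_q, w_k, w_v = (torch.randn(64, 64) for _ in range(3))
    out = self_attention(tokens, w_q, w_k, w_v)          # (12, 64): each row mixes all 12 tokens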

Related Work
Limitation
Tokenization and Embedding
Dataset Description
Performance Measures
Results
Results Discussion and Interpretation
Conclusions and Future Work