Abstract
There are many programming languages, each with its own structure and way of writing code, which makes it difficult to learn them and to switch between them frequently. As a result, a person working with multiple programming languages must consult documentation often, which costs time and effort. In the past few years, there has been a significant increase in the number of papers published on this topic, each proposing its own solution to this problem. Many of these papers apply NLP concepts in distinct configurations to obtain the desired results. Some have used AI along with NLP to train a system to generate source-code in a specific language, and some have trained the AI directly without pre-processing the dataset with NLP. All of these papers face two problems: the lack of a proper dataset for this particular application, and the fact that each approach can convert natural language into source-code for only one specified programming language. The proposed system shows that a language-independent solution is a feasible alternative for writing source-code without full knowledge of a programming language. The proposed system uses Natural Language Processing to convert natural language into programming language-independent pseudo code using custom Named Entity Recognition, and saves it in XML (eXtensible Markup Language) format as an intermediate step. Then, using traditional programming, the system converts the generated pseudo code into programming language-dependent source-code. This paper also proposes a novel method for creating a dataset from scratch, using predefined structures that are filled with predefined keywords to create unique combinations of training data.
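The dataset-creation idea in the abstract (predefined structures filled with predefined keywords) can be illustrated with a minimal sketch. The templates, keyword lists, entity labels, and the `generate_dataset` helper below are all hypothetical and are not taken from the paper; the character-offset annotation layout is likewise an assumption about what a custom Named Entity Recognition model might be trained on.

```python
from itertools import product

# Hypothetical sentence structures; each {slot} is filled from KEYWORDS.
TEMPLATES = [
    "create a {dtype} variable named {name}",
    "declare {name} as a {dtype}",
]

# Hypothetical keyword lists, one per slot.
KEYWORDS = {
    "dtype": ["string", "float"],
    "name": ["count", "total"],
}


def generate_dataset(templates, keywords):
    """Fill every template with every keyword combination.

    Returns (text, entities) pairs, where each entity is a
    (start, end, LABEL) character span for one filled slot.
    """
    slots = sorted(keywords)
    samples = []
    for template in templates:
        for values in product(*(keywords[slot] for slot in slots)):
            filling = dict(zip(slots, values))
            text = template.format(**filling)
            entities = []
            for slot, value in filling.items():
                # Assumes each keyword occurs exactly once in the sentence.
                start = text.find(value)
                entities.append((start, start + len(value), slot.upper()))
            samples.append((text, sorted(entities)))
    return samples


if __name__ == "__main__":
    for text, entities in generate_dataset(TEMPLATES, KEYWORDS):
        print(text, entities)
```

With two templates and two keywords per slot, this produces eight distinct annotated sentences; the number of samples grows multiplicatively as templates and keywords are added.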
Highlights
This proposal shows that a language-independent solution is a feasible alternative for writing source-code without full knowledge of a programming language
Using XML-based pseudo code as an intermediate step makes this method programming language-independent, which addresses a major drawback of existing research: a rigid commitment to a single programming language
To convert natural language into another programming language, a person needs to duplicate the premade language format and fill it with the keywords of their desired programming language
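As a rough illustration of the highlights above, the sketch below renders one XML pseudo code document through two "language formats". The XML tag and attribute names, the `FORMATS` table, and the `render` function are assumptions made for this example; the paper's actual schema and format files are not shown in this excerpt.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML pseudo code; tag and attribute names are illustrative only.
PSEUDO_XML = """
<program>
  <declare name="total" type="int" value="0"/>
  <print value="total"/>
</program>
"""

# One hypothetical "language format" per target language. Supporting a new
# language means copying an entry and replacing its strings.
FORMATS = {
    "python": {
        "declare": "{name} = {value}",
        "print": "print({value})",
    },
    "c": {
        "declare": "{type} {name} = {value};",
        "print": 'printf("%d\\n", {value});',
    },
}


def render(xml_text, language):
    """Convert XML pseudo code into source lines for the given language."""
    fmt = FORMATS[language]
    root = ET.fromstring(xml_text.strip())
    # Each child element is one pseudo code statement; its attributes
    # fill the placeholders of the matching format string.
    return [fmt[node.tag].format(**node.attrib) for node in root]


if __name__ == "__main__":
    for language in FORMATS:
        print(language, render(PSEUDO_XML, language))
```

Because the pseudo code carries no language-specific syntax, the same XML input serves every target; only the format table changes.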
Summary
Source-code is a list of human-readable instructions written in a particular programming language. The aim of source-code is to express instructions with a precise specification, format and rules so that they can be translated into machine language [1]. Source-code is therefore the foundation of a computer program. It is usually written by a programmer or developer who has some training in and knowledge of the programming language. There are many independent languages, and each has its own distinctive way of writing instructions. A Natural Language Interface (NLI) provides a different input method in which users interact with a computer using a spoken human language such as English, instead of using a graphical user interface (GUI), a command line interface (CLI), or computer languages like C and Python [2]. NLI enables the computer