Abstract

Natural Language Processing has improved tremendously with the success of Deep Learning, and Neural Machine Translation (NMT) has emerged as one of its most powerful applications. The same idea has recently been applied to source code. Code Generation (CG) is the task of generating source code from natural language input. This paper introduces a Python parallel corpus of natural language intents paired with source code. It also proposes a Code Generation model based on the Transformer architecture used in NMT, applying code tokenization and code embeddings to the custom parallel corpus. The proposed architecture achieved a BLEU score of 32.4 and a Rouge-L score of 82.1, which is on par with natural language translation.
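To illustrate the kind of pipeline the abstract describes, the sketch below pairs Python's standard tokenize module (for code tokenization) with a PyTorch nn.Transformer encoder-decoder over learned token embeddings. The vocabulary sizes, layer counts, and toy inputs are assumptions made here for exposition only; they are not the exact architecture or hyperparameters reported in the paper.

import io
import tokenize

import torch
import torch.nn as nn

def code_tokens(src: str) -> list:
    """Tokenize a Python snippet into lexical tokens (illustrative)."""
    return [tok.string
            for tok in tokenize.generate_tokens(io.StringIO(src).readline)
            if tok.string.strip()]

class NL2CodeTransformer(nn.Module):
    """Minimal NL-intent to code-token encoder-decoder.

    Hyperparameters are illustrative assumptions, not the paper's settings.
    """

    def __init__(self, nl_vocab: int, code_vocab: int, d_model: int = 256):
        super().__init__()
        self.src_emb = nn.Embedding(nl_vocab, d_model)    # NL intent embeddings
        self.tgt_emb = nn.Embedding(code_vocab, d_model)  # code token embeddings
        self.transformer = nn.Transformer(
            d_model=d_model, nhead=8,
            num_encoder_layers=3, num_decoder_layers=3,
            batch_first=True,
        )
        self.out = nn.Linear(d_model, code_vocab)         # project to code vocabulary

    def forward(self, src_ids: torch.Tensor, tgt_ids: torch.Tensor) -> torch.Tensor:
        # Causal mask so each code token only attends to earlier code tokens.
        causal_mask = self.transformer.generate_square_subsequent_mask(tgt_ids.size(1))
        hidden = self.transformer(
            self.src_emb(src_ids), self.tgt_emb(tgt_ids), tgt_mask=causal_mask
        )
        return self.out(hidden)  # (batch, tgt_len, code_vocab) logits

# Toy usage with random token ids standing in for a real parallel corpus.
model = NL2CodeTransformer(nl_vocab=1000, code_vocab=1200)
src = torch.randint(0, 1000, (2, 12))   # batched NL intents
tgt = torch.randint(0, 1200, (2, 20))   # batched code token sequences
print(code_tokens("x = max(a, b)"))     # ['x', '=', 'max', '(', 'a', ',', 'b', ')']
print(model(src, tgt).shape)            # torch.Size([2, 20, 1200])

In practice the decoder's output logits would be trained with cross-entropy against the reference code tokens from the parallel corpus and decoded with beam search, mirroring standard NMT training.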
