NMT-Based Code Generation for Coding Assistance with Natural Language

Yuka Akinobu, Momoka Obara, Kimio Kuramitsu, Teruno Kajiura

doi:10.2197/ipsjjip.30.443

Open Access

https://doi.org/10.2197/ipsjjip.30.443

Abstract

This paper presents a coding-assistance approach that generates Python code from natural language descriptions using neural machine translation (NMT). Although deep-learning-based coding assistance has attracted considerable attention in recent years, few applications have used NMT models. A major barrier is the shortage of parallel corpora that pair natural language descriptions with source code. To address this shortage, we propose a method for synthesizing parallel corpora that exploits the formal nature of programming languages, drawing on abstract syntax trees (ASTs) and a corpus of code fragments. Using this method, we constructed a parallel corpus with tens of thousands of entries and trained PyNMT models to generate Python code from Japanese input sentences. The trained models predicted the Python code from user input sentences with an accuracy of 28%. We describe the synthesis method and summarize the results of the evaluation experiments conducted on the PyNMT models.
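To make the idea of AST-based corpus synthesis more concrete, the following is a minimal sketch and not the authors' actual procedure, which the abstract does not detail. It assumes a small set of seed pairs (a description template plus a code fragment) and produces new pairs by rewriting variable names in the fragment's AST. The names `SEEDS`, `RenameName`, and `synthesize` are hypothetical, English templates stand in for the Japanese sentences used in the paper, and `ast.unparse` requires Python 3.9 or later.

```python
import ast

# Hypothetical seed pairs: (description template, code fragment).
# The placeholder variable in every fragment is assumed to be named "x".
SEEDS = [
    ("sort {x} in ascending order", "sorted(x)"),
    ("get the length of {x}", "len(x)"),
]


class RenameName(ast.NodeTransformer):
    """Replace every occurrence of one variable name with another."""

    def __init__(self, old, new):
        self.old = old
        self.new = new

    def visit_Name(self, node):
        if node.id == self.old:
            renamed = ast.Name(id=self.new, ctx=node.ctx)
            return ast.copy_location(renamed, node)
        return node


def synthesize(seeds, names):
    """Expand each seed into one (description, code) pair per variable name."""
    pairs = []
    for template, fragment in seeds:
        for name in names:
            # Parse the fragment as a single expression, rewrite it, print it back.
            tree = ast.parse(fragment, mode="eval")
            tree = RenameName("x", name).visit(tree)
            pairs.append((template.format(x=name), ast.unparse(tree)))
    return pairs


pairs = synthesize(SEEDS, ["data", "names"])
for description, code in pairs:
    print(f"{description}\t{code}")
```

Working on the AST rather than on raw strings means that only real variable references are renamed, so a substring such as the `x` in a function name like `max` is left untouched. A realistic pipeline would presumably need far richer templates and fragment combinations than this sketch shows.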
