Abstract

This paper proposes an attempt to realize coding assistance that generates Python code from natural language descriptions using neural machine translation. Although coding assistance with deep learning has recently become a major concern, few applications have used neural machine translation models. One of the major barriers is the shortage of a parallel corpus of natural language descriptions and source code. To overcome the shortage of parallel corpora, we propose a method for synthesizing parallel corpora that utilizes the formal nature of programming languages. We aim to establish a new method using an abstract syntax tree (AST) and a corpus of code fragments. Using the proposed synthesis method, we successfully constructed tens of thousands of parallel corpora and trained PyNMT models to generate Python code from Japanese input sentences. The trained PyNMT models successfully predicted the Python code from user input sentences with an accuracy of 28%. In this study, we propose a synthetic method for a parallel corpus and summarize the results of the evaluation experiments conducted on PyNMT models.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.