NeuralDoc-Automating Code Translation Using Machine Learning

Sai Sree Harsha, Aditya Chandrashekhar Sohoni, K Chandrasekaran

doi:10.1007/978-981-16-6940-8_11

Abstract

Source code documentation is the process of writing concise natural language descriptions of how source code behaves at run time. In this work, we propose a novel approach, called NeuralDoc, for automating source code documentation using machine learning techniques. We model automatic code documentation as a language translation task: the source code serves as the input sequence, which the machine learning model translates into natural language sentences describing the functionality of the program. The model we use is the Transformer, which relies on self-attention and multi-head attention to capture long-range dependencies effectively and has been shown to perform well on a range of natural language processing tasks. We integrate a copy attention mechanism and BERT, a pre-training technique, into the basic Transformer architecture. We build an intuitive interface for users to interact with our models and deploy our system as a web application. We carry out experiments on two datasets, consisting of Java and Python source programs and their documentation, to demonstrate the effectiveness of our proposed method.

Keywords

Program comprehension, Automatic documentation, Neural machine translation, Transformer, BERT, Software engineering
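
The abstract names a copy attention mechanism without giving its formulation. As a rough illustration only, the sketch below follows the widely used pointer-generator style of mixing, which is an assumption here and not necessarily the exact variant used in NeuralDoc. The function name `copy_augmented_distribution`, the parameter `p_gen`, and the toy tokens are all hypothetical. The idea is that the decoder can either generate a word from its vocabulary or copy a token (such as an identifier) directly from the source code, weighted by attention.

```python
from collections import defaultdict


def copy_augmented_distribution(vocab_probs, attention, source_tokens, p_gen):
    """Mix a generation distribution with a copy distribution.

    Assumed formulation (pointer-generator style, not taken from the paper):
        P(w) = p_gen * P_vocab(w) + (1 - p_gen) * sum of attention on
               source positions whose token equals w

    vocab_probs:   dict mapping token -> probability from the decoder softmax
    attention:     attention weights over source positions (should sum to 1)
    source_tokens: source tokens, one per attention weight
    p_gen:         probability in [0, 1] of generating rather than copying
    """
    if len(attention) != len(source_tokens):
        raise ValueError("attention and source_tokens must have equal length")
    if not 0.0 <= p_gen <= 1.0:
        raise ValueError("p_gen must lie in [0, 1]")

    final = defaultdict(float)
    # Generation path: scale the vocabulary distribution by p_gen.
    for token, prob in vocab_probs.items():
        final[token] += p_gen * prob
    # Copy path: route the remaining mass to source tokens via attention.
    for token, weight in zip(source_tokens, attention):
        final[token] += (1.0 - p_gen) * weight
    return dict(final)


# Toy example: the identifier "get_user_id" is not in the vocabulary,
# but it can still be produced because attention points at it.
vocab_probs = {"returns": 0.5, "the": 0.3, "<unk>": 0.2}
source_tokens = ["def", "get_user_id", "(", ")"]
attention = [0.1, 0.7, 0.1, 0.1]

dist = copy_augmented_distribution(vocab_probs, attention, source_tokens, p_gen=0.5)
best = max(dist, key=dist.get)
print(best, round(dist[best], 3))
```

In this toy setting the out-of-vocabulary identifier receives probability mass only through the copy path, which is the usual motivation for copy mechanisms in code summarization, where identifiers are frequent and rarely covered by a fixed vocabulary.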

Full Text