Abstract

Source code comments are a cornerstone of software documentation, facilitating feature development and maintenance. Well-defined documentation formats, like Javadoc, make it easy to include structural metadata that can be used, for example, to generate documentation manuals. However, the actual usage of structural elements in source code comments has not yet been studied. We investigate to what extent these structural elements are used in practice and whether the added information can be leveraged to improve tools that assist developers in writing comments. Existing research on comment generation traditionally focuses on the automatic generation of summaries. However, recent work has shown promising results in supporting comment authoring through next-word prediction. In this paper, we present an in-depth analysis of commenting practice in more than 18K open-source projects written in Python and Java, showing that many structural elements, particularly parameter and return value descriptions, are indeed widely used. We discover that while a majority of comments are rather short, at about 6 to 9 words, many are several hundred words in length. We further find that Python comments tend to be significantly longer than Java comments, possibly due to the weakly typed nature of the former. Following the empirical analysis, we extend an existing language model with support for structural information, substantially improving the Top-1 accuracy of predicted words (Python 9.6%, Java 7.8%).
