Abstract

Approaches to evaluate comments based on whether they increase code comprehensibility for software maintenance tasks are important, but largely missing. We propose Comment for the automated classification and quality evaluation of code comments in C codebases based on how well they help developers understand existing code. We conduct surveys and document developers' perceptions of the types of comments that prove useful for maintaining software, in the form of comment categories. A total of 20,206 comments were collected from open-source GitHub projects and annotated with assistance from industry experts. We develop features that semantically analyze comments to locate concepts related to the categories of usefulness. Additionally, features based on code–comment correlation are designed to infer whether a comment is consistent with the code and not superfluous. Using neural networks, comments are classified as useful, partially useful, or not useful, with precision and recall scores of 86.27% and 86.42%, respectively. The proposed framework for comment quality evaluation incorporates industry practices and adds significant value for companies wanting to formulate better code commenting strategies. Furthermore, large codebases can be de-cluttered by removing comments that do not help in maintaining the code.
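To make the idea of a code–comment correlation feature concrete, the sketch below computes one such signal: the fraction of comment words that also appear as sub-tokens of identifiers in the adjacent code. This is an illustrative assumption, not the paper's actual feature set (the paper targets C codebases and uses a richer semantic analysis); a very high overlap can hint that a comment merely restates the code, while near-zero overlap can hint at inconsistency.

```python
import re


def token_overlap(comment: str, code: str) -> float:
    """Fraction of comment words that also occur as code identifier sub-tokens.

    A hypothetical correlation feature: high overlap may flag a superfluous
    comment that restates the code; very low overlap may flag inconsistency.
    """
    comment_words = set(re.findall(r"[a-zA-Z_]\w*", comment.lower()))
    # Split camelCase and snake_case identifiers into lowercase sub-tokens.
    code_tokens = set()
    for ident in re.findall(r"[a-zA-Z_]\w*", code):
        for part in re.split(r"_|(?<=[a-z])(?=[A-Z])", ident):
            if part:
                code_tokens.add(part.lower())
    if not comment_words:
        return 0.0
    return len(comment_words & code_tokens) / len(comment_words)


# "counter" is the only comment word found in the code tokens (1 of 3 words).
print(token_overlap("increment the counter", "counter++;"))
```

In a full pipeline, such a score would be one of several inputs (alongside semantic category features) to the neural classifier that labels each comment as useful, partially useful, or not useful.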
