A Survey of Automatic Source Code Summarization

Chunyan Zhang, Junchao Wang, Fudong Liu, Hairen Gui, Ting Xu, Ke Tang, Qinglei Zhou

doi:10.3390/sym14030471

Abstract

Source code summarization is the natural language description of what a piece of source code does. It helps developers understand the semantics of the source code. The source code and its corresponding summary can be thought of as symmetric. In practice, however, existing summaries are often mismatched with the source code, missing, or out of date, and writing them manually is inefficient and requires considerable human effort. To address this, many studies have been conducted on Automatic Source Code Summarization (ASCS). Given source code as input, ASCS techniques automatically generate a summary in natural language. In this paper, we review the development of ASCS technology. Almost all ASCS techniques involve the following stages: source code modeling, code summarization generation, and quality evaluation. We categorize the existing ASCS techniques by these stages and analyze their advantages and shortcomings. We also draw a clear map of the development of the existing algorithms.

Highlights

Code summarization, also called code comment, is a text description of the function and purpose of special identifiers in computer programs.
We conducted an in-depth analysis of Automatic Source Code Summarization (ASCS): (1) We outlined the core of the paper, which consists of the current challenges, and systematized ASCS along three dimensions: source code analysis, code summarization generation algorithms, and the methodologies used to evaluate them.
(3) We summarized the effective evaluation mechanisms of ASCS and analyzed the recent evaluation methods.

Summary

Introduction

Code summarization, also called code comment, is a text description of the function and purpose of special identifiers in computer programs. The quality evaluation methods of NLP are used for code summarization, even though source code differs from natural language text. Following the development of the techniques, we summarize the work from three aspects: source code modeling, automatic code summarization algorithms, and summarization quality evaluation. Almost all source code modeling uses machine learning, and paper [45] can be used as a reference. It carried out an extensive literature search and identified 364 primary studies published between 2002 and 2021, aiming to summarize the current knowledge in the area of applied machine learning for source code analysis. Quality evaluation measures the pros and cons of ASCS algorithms through the code summaries they generate.
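As a rough illustration of the three stages named above, the following Python sketch chains a token-based code model, a retrieval-style (IR-like) summary generator, and a simple automatic quality score. It is a toy under stated assumptions and not a method taken from the survey: the function names, the two-entry corpus, the use of Jaccard similarity for retrieval, and clipped unigram precision as the score are all choices made for this example.

```python
import re
from collections import Counter


def tokenize_code(code):
    """Stage 1 (token-based model): split identifiers into lowercase sub-tokens.

    Handles snake_case and camelCase; everything else (operators, punctuation)
    is ignored. This is an illustrative tokenizer, not a real lexer.
    """
    tokens = []
    for ident in re.findall(r"[A-Za-z_][A-Za-z0-9_]*", code):
        for part in ident.split("_"):
            pieces = re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])|\d+", part)
            tokens.extend(p.lower() for p in pieces)
    return tokens


def jaccard(a, b):
    """Jaccard similarity between two token lists, treated as sets."""
    sa, sb = set(a), set(b)
    if not sa and not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def retrieve_summary(query_code, corpus):
    """Stage 2 (retrieval-style generation): reuse the summary of the most
    similar snippet in a corpus of (code, summary) pairs."""
    query_tokens = tokenize_code(query_code)
    best_code, best_summary = max(
        corpus, key=lambda pair: jaccard(query_tokens, tokenize_code(pair[0]))
    )
    return best_summary


def unigram_precision(candidate, reference):
    """Stage 3 (automatic evaluation): clipped unigram precision of a generated
    summary against one reference summary. This is only a simplified stand-in
    for the NLP metrics used in practice."""
    cand_words = candidate.lower().split()
    if not cand_words:
        return 0.0
    ref_counts = Counter(reference.lower().split())
    matched = sum(min(n, ref_counts[w]) for w, n in Counter(cand_words).items())
    return matched / len(cand_words)


# Hypothetical two-entry corpus, made up for this example.
corpus = [
    ("def read_file(path): return open(path).read()", "read the contents of a file"),
    ("def add_numbers(a, b): return a + b", "add two numbers"),
]

query = "def readTextFile(filePath): return open(filePath).read()"
generated = retrieve_summary(query, corpus)
score = unigram_precision(generated, "read the content of the file")
print(generated, round(score, 3))
```

The score in this sketch rewards only exact word overlap, so "contents" and "content" count as a mismatch. That limitation is one concrete way to see why metrics borrowed from NLP may not fully capture the quality of a code summary.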

Source Code Modeling

Token-Based Source Code Model

Graph-Based Source Code Model

Other Source Code Models

Code Summarization Generation

Manually-Crafted Templates-Based ASCS Generation

IR-Based ASCS Generation

DL-Based ASCS Generation

Quality Evaluation

Datasets

Methods

Automatic Evaluation Mechanism

Human Evaluation

Findings

Discussion and Conclusions

Journal: Symmetry
Publication Date: Feb 25, 2022
Citations: 22
License type: CC BY 4.0

Similar Papers

Improved Code Summarization via a Graph Neural Network
Alexander Leclair ... Sakib Haque
13 Jul 2020

Automatic Source Code Summarization with Extended Tree-LSTM
Yusuke Shido ... Akihiro Yamamoto
01 Jul 2019

Function Call Graph Context Encoding for Neural Source Code Summarization
Aakash Bansal ... Collin Mcmillan
IEEE Transactions on Software Engineering | VOL. 49
01 Sep 2023

Action Word Prediction for Neural Source Code Summarization
Sakib Haque ... Aakash Bansal
01 Mar 2021
