CAST: Enhancing Code Summarization with Hierarchical Splitting and Reconstruction of Abstract Syntax Trees

Ensheng Shi, Yanlin Wang

doi:10.48448/zyqs-wm08

Abstract

Code summarization aims to generate concise natural language descriptions of source code, which can help improve program comprehension and maintenance. Recent studies show that syntactic and structural information extracted from abstract syntax trees (ASTs) is conducive to summary generation. However, existing approaches fail to fully capture the rich information in ASTs because of the large size/depth of ASTs. In this paper, we propose a novel model CAST that hierarchically splits and reconstructs ASTs. First, we hierarchically split a large AST into a set of subtrees and utilize a recursive neural network to encode the subtrees. Then, we aggregate the embeddings of subtrees by reconstructing the split ASTs to get the representation of the complete AST. Finally, AST representation, together with source code embedding obtained by a vanilla code token encoder, is used for code summarization. Extensive experiments, including the ablation study and the human evaluation, on benchmarks have demonstrated the power of CAST.
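
The split-then-aggregate idea described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration rather than the paper's method: it uses Python's standard `ast` module, the choice of split points in `BLOCK_TYPES` is hypothetical, and a plain statement count stands in for the learned subtree embedding that a recursive neural network would produce. The functions `split` and `aggregate` are hypothetical names.

```python
import ast

# Statement types treated as split points. This is an assumption for the
# sketch, not the splitting rules used by CAST.
BLOCK_TYPES = (ast.FunctionDef, ast.For, ast.While, ast.If, ast.With)


def split(node, subtrees, parent=None):
    """Record `node` as a subtree root, then recurse into nested blocks.

    Each entry of `subtrees` holds the block's node type, the index of its
    parent subtree, and the statements that belong directly to it.
    """
    index = len(subtrees)
    entry = {"kind": type(node).__name__, "parent": parent, "stmts": []}
    subtrees.append(entry)
    for field in ("body", "orelse"):
        for stmt in getattr(node, field, []):
            if isinstance(stmt, BLOCK_TYPES):
                # A nested block becomes its own subtree; keep a placeholder.
                child = split(stmt, subtrees, parent=index)
                entry["stmts"].append(f"<subtree {child}>")
            else:
                entry["stmts"].append(type(stmt).__name__)
    return index


def aggregate(subtrees):
    """Combine per-subtree values bottom-up along the parent links.

    A statement count stands in for a learned subtree embedding.
    """
    totals = [
        sum(1 for s in t["stmts"] if not s.startswith("<subtree"))
        for t in subtrees
    ]
    # Children always get larger indices than their parents, so a reverse
    # pass visits every child before its parent.
    for i in range(len(subtrees) - 1, 0, -1):
        totals[subtrees[i]["parent"]] += totals[i]
    return totals


SOURCE = """
def total_positive(xs):
    total = 0
    for x in xs:
        if x > 0:
            total += x
    return total
"""

func = ast.parse(SOURCE).body[0]
subtrees = []
split(func, subtrees)
totals = aggregate(subtrees)

for i, (tree, total) in enumerate(zip(subtrees, totals)):
    print(i, tree["kind"], tree["parent"], tree["stmts"], total)
```

Each subtree stays small regardless of how deeply the original function nests, and the parent links preserve enough structure to rebuild a representation of the whole tree from its parts.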

Similar Papers

CAST: Enhancing Code Summarization with Hierarchical Splitting and Reconstruction of Abstract Syntax Trees
Ensheng Shi ... Dongmei Zhang
01 Jan 2020

Leveraging Code Generation to Improve Code Retrieval and Summarization via Dual Learning
Wei Ye ... Jinglei Zhang
20 Apr 2020

A Tale of Two Comprehensions? Analyzing Student Programmer Attention during Code Summarization
Zachary Karas ... Yifan Zhang
ACM Transactions on Software Engineering and Methodology
15 May 2024

Summarizing Source Code with Transferred API Knowledge
Xing Hu ... David Lo
01 Jul 2018