CodeGen-Search: A Code Generation Model Incorporating Similar Sample Information

Hongwei Li, Yingjian Xiao, Ganlin Liu, Maosheng Zhong, Gen Liu, Jiangling Kuang, Zhixiang Wang

doi:10.1142/s0218194023500584

Abstract

Code generation supports software development by reducing labor intensity and improving development efficiency. Some scholars use similar code information to enhance the quality of code generation. In daily development tasks, developers likewise search for similar samples as references to improve the efficiency and accuracy of their programming, drawing on the syntactic structure and semantic information of the code in those samples. Inspired by this, we argue that similar samples are helpful for code generation. This paper proposes the CodeGen-Search model, which improves code generation quality by incorporating similar samples. To fully utilize the information in similar samples, the model adopts the “pre-training [Formula: see text] fine-tuning” pattern. The model uses a minimum edit distance algorithm to retrieve samples whose natural language (NL) descriptions are similar to the input NL, and uses different encoders to extract the features of the NL and the code in the similar samples. Experimental results show that our model effectively improves the quality of the generated code. Compared to the state-of-the-art model, CodeGen-Search improves BLEU by 1.5% and ROUGE by 0.8% on the HS dataset, and StrAcc by 0.5% on the ATIS dataset.
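The retrieval step mentioned in the abstract can be illustrated with a minimal sketch. It assumes token-level Levenshtein distance with unit costs and a simple top-k ranking over (NL, code) pairs; the abstract does not specify the tokenization, cost scheme, or number of retrieved samples, so the function names `edit_distance` and `retrieve_similar` and all parameters below are hypothetical and not taken from the paper.

```python
from typing import List, Sequence, Tuple


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance between two sequences, with unit costs (an assumption)."""
    # prev[j] holds the distance between the first i-1 items of a and the first j items of b.
    prev = list(range(len(b) + 1))
    for i, item_a in enumerate(a, start=1):
        curr = [i]  # distance between a[:i] and the empty prefix of b
        for j, item_b in enumerate(b, start=1):
            substitution = prev[j - 1] + (0 if item_a == item_b else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]


def retrieve_similar(
    query: str, corpus: List[Tuple[str, str]], k: int = 1
) -> List[Tuple[str, str]]:
    """Return the k (NL, code) pairs whose NL is closest to the query.

    Whitespace tokenization and lowercasing are illustrative choices only.
    """
    query_tokens = query.lower().split()
    ranked = sorted(
        corpus,
        key=lambda pair: edit_distance(query_tokens, pair[0].lower().split()),
    )
    return ranked[:k]
```

Called on a query and a list of (NL, code) training pairs, `retrieve_similar` returns the nearest pairs, whose NL and code could then be passed to separate encoders as the abstract describes. Sorting the whole corpus per query is quadratic-time per comparison and linear in corpus size; a real system would likely need indexing or pruning.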

More From: International Journal of Software Engineering and Knowledge Engineering

Similar Papers

Research on Machine Learning Program Generation Algorithm Based on AORBCO
Shiqian Wang ... Wuqi Gao
International Journal of Advanced Network, Monitoring and Controls | VOL. 9
01 Jun 2024

An illumination of the template enigma: software code generation with templates
18 Nov 2015

Perspectives on Information Structure in Austronesian Languages
01 Apr 2020

SSEGCN: Syntactic and Semantic Enhanced Graph Convolutional Network for Aspect-based Sentiment Analysis
27 Jun 2022
