Abstract

Currently, software defect-prediction technology is being extensively researched in the design of metrics. However, the research objects are mainly limited to coarse-grained entities such as classes, files, and packages, and there is a wide range of defects that are difficult to predict in actual situations. To further explore the information between sequences of method calls and to learn the code semantics and syntactic structure between methods, we generated a method-call sequence that retains the code context structure information and the token sequence representing semantic information. We embedded the token sequence into the method-call sequence and encoded it into a fixed-length real-valued vector. We then built a defect-prediction model based on the transformer, which maps the code-vector representation containing the method-call sequences to a low-dimensional vector space to generate semantic features and syntactic structure features and also predicts the defect density of the method-call sequence. We conducted experiments on 10 open-source projects using the ELFF dataset. The experimental results show that the method-call sequence-level prediction effect is better than the class-level effect, and the prediction results are more stable than those of the method level. The mean absolute error (MAE) value of our approach was 8% lower than that of the other deep-learning methods.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call