Abstract

Deep learning-based software defect prediction has become popular in recent years. The recent release of the CodeBERT model has enabled its use in many software engineering tasks. We propose several CodeBERT models targeting software defect prediction: CodeBERT-NT, CodeBERT-PS, CodeBERT-PK, and CodeBERT-PT. We perform empirical studies using these models in cross-version and cross-project software defect prediction to investigate whether a neural language model like CodeBERT can improve prediction performance. We also investigate the effects of different prediction patterns in software defect prediction using CodeBERT models. The empirical results are further discussed.

Highlights

  • As modern software is getting more complex, it is of great importance to ensure software reliability

  • We investigate the feasibility of using the CodeBERT model for software defect prediction

  • We propose two research questions to investigate the feasibility, efficiency, and prediction patterns of using CodeBERT, a programming language model, for software defect prediction

Summary

Introduction

As modern software grows more complex, ensuring software reliability is of great importance. Hand-crafted metrics have traditionally been used in software defect prediction. Deep learning is now increasingly used as well, because deep learning models are more capable of extracting information from long texts, i.e., source code. The first step is to extract software modules from open-source repositories. The second step is to mark software modules as buggy or clean. The third step is to extract code features from software modules. In deep learning-based software defect prediction, typical code features include character-based, token-based, AST-node-based, AST-tree-based, AST-path-based, and AST-graph-based features. Deep learning models used in software defect prediction include CNNs, LSTMs, and Transformers. The last step is to use the trained deep learning model for inference, i.e., to predict whether a software module is buggy or clean.
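The labelling and feature-extraction steps can be illustrated with a minimal sketch. It is not the authors' implementation: the `Module` class, the helper names, the regular-expression tokenizer, the toy Java snippets, and the `max_len` value are all assumptions made for illustration. It shows only how modules might be marked buggy or clean and turned into fixed-length token-based features that a model could consume.

```python
import re
from dataclasses import dataclass

# Assumed token pattern: identifiers, integer literals, or any single symbol.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|\S")


@dataclass
class Module:
    """A software module (here: one source file) with its defect label."""
    path: str
    source: str
    buggy: bool


def label_modules(sources, buggy_paths):
    """Mark each module as buggy or clean, given a set of known-buggy paths."""
    return [Module(p, s, p in buggy_paths) for p, s in sorted(sources.items())]


def tokenize(source):
    """Split source code into a flat list of tokens."""
    return TOKEN_RE.findall(source)


def build_vocab(modules):
    """Assign an integer id to every token seen in the training modules."""
    vocab = {"<pad>": 0, "<unk>": 1}
    for module in modules:
        for tok in tokenize(module.source):
            vocab.setdefault(tok, len(vocab))
    return vocab


def encode(source, vocab, max_len=16):
    """Map tokens to ids, then truncate or pad to exactly max_len entries."""
    ids = [vocab.get(t, vocab["<unk>"]) for t in tokenize(source)][:max_len]
    return ids + [vocab["<pad>"]] * (max_len - len(ids))


# Toy data standing in for modules extracted from a repository.
sources = {
    "A.java": "int add(int a, int b) { return a - b; }",
    "B.java": "int id(int a) { return a; }",
}
modules = label_modules(sources, buggy_paths={"A.java"})
vocab = build_vocab(modules)
features = [encode(m.source, vocab) for m in modules]
labels = [int(m.buggy) for m in modules]  # 1 = buggy, 0 = clean
```

In a real pipeline, `features` and `labels` would be passed to a model such as a CNN, LSTM, or Transformer for training. A pre-trained model like CodeBERT would use its own subword tokenizer rather than the hand-built vocabulary assumed here.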
