Abstract

Software defect prediction can effectively detect potential defects in a software system. The most common approach is to build machine learning models on software metrics. However, most prediction models are proposed without considering the confounding effects of the size metric. The size metric has unexpected correlations with other software metrics and introduces bias into prediction results. How to suitably remove these confounding effects to improve a prediction model’s performance is still largely unexplored. This paper proposes a method that causally removes the negative confounding effects of the size metric. First, we quantify the confounding effects based on a causal graph. Then, we analyze each confounding effect to determine whether it is positive or negative, and remove only the negative ones. Extensive experiments on eight data sets demonstrate the effectiveness of the proposed method: in general, the prediction model’s performance improves after the negative confounding effects of the size metric are removed.
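To make the idea of "removing the effect of size" concrete, the sketch below shows plain linear residualization: a metric is regressed on size and only the residual is kept. This is not the causal-graph method proposed in the paper, which quantifies each confounding effect and removes only the negative ones; it is a simpler, non-selective stand-in for illustration. The function name `residualize_on_size` and the toy values are assumptions, not taken from the paper or from the data sets.

```python
import numpy as np


def residualize_on_size(metric, size):
    """Return the part of `metric` that a linear fit on `size` does not explain.

    Hypothetical helper for illustration only; the paper's method is
    selective and causal-graph based, which this sketch is not.
    """
    metric = np.asarray(metric, dtype=float)
    size = np.asarray(size, dtype=float)
    # Design matrix: an intercept column plus the size column.
    X = np.column_stack([np.ones_like(size), size])
    # Ordinary least squares fit of metric ~ intercept + size.
    coef, *_ = np.linalg.lstsq(X, metric, rcond=None)
    # Residual = observed metric minus the size-predicted value.
    return metric - X @ coef


# Toy, made-up values: module size (e.g. lines of code) and a complexity metric.
size = np.array([10.0, 20.0, 30.0, 40.0, 50.0])
complexity = np.array([3.0, 5.0, 6.0, 9.0, 10.0])

adjusted = residualize_on_size(complexity, size)
```

By construction, the adjusted metric is linearly uncorrelated with size, so a model trained on it cannot pick up size indirectly through that metric. Whether doing this helps or hurts prediction is exactly the question the paper addresses by separating positive from negative confounding effects.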

Highlights

  • The experimental data come from eight cleaned data sets of the Metrics Data Program (MDP) repository

  • The MDP is commonly used in the field of software defect prediction

Summary

Introduction

With the development of information technology, software systems have become pervasive in people’s daily lives. These systems are closely tied to a country’s economic revitalization and social development, so ensuring their quality is crucial. Software defects are an essential factor affecting software system quality [1,2], and software developers should search for defects in order to improve that quality [3]. The current software development process is often agile. Software defect prediction can effectively predict potential software defects, allowing testers to devote more resources to the software modules that are more likely to be defective.
