Abstract

This paper analyzes the characteristics of software defect prediction from a machine learning perspective and proposes a semi-supervised software defect prediction method based on sampling and ensemble learning (SISDP). The method addresses two problems: class imbalance in software defect data and incompletely labeled data sets. SISDP first draws a class-balanced sample from the labeled data to build a robust KNN labeling model and uses it to label a batch of unlabeled data. The newly labeled data are then added to the original data set to build the next labeling model, and the process iterates until all data are labeled. On the fully labeled data set, a hybrid sampling algorithm produces the training set, on which an ensemble classification model composed of multiple classification algorithms is trained. SISDP not only reduces the interference of class imbalance with the labeling process but also improves the generalization ability of the defect prediction model.
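The iterative labeling stage described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the function names (`balanced_sample`, `knn_predict`, `iterative_label`), the use of random undersampling as the balancing step, the Euclidean distance, and the values of `k` and `batch_size` are all hypothetical choices, since the abstract does not specify them. The hybrid sampling and ensemble training stage is not shown.

```python
import random
from collections import Counter


def balanced_sample(X, y, rng):
    """Undersample every class to the size of the smallest class (assumed balancing step)."""
    by_class = {}
    for row, label in zip(X, y):
        by_class.setdefault(label, []).append(row)
    n = min(len(rows) for rows in by_class.values())
    Xb, yb = [], []
    for label, rows in by_class.items():
        for row in rng.sample(rows, n):
            Xb.append(row)
            yb.append(label)
    return Xb, yb


def knn_predict(X_train, y_train, x, k=3):
    """Label x by majority vote among its k nearest training rows (squared Euclidean distance)."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), label)
        for row, label in zip(X_train, y_train)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def iterative_label(X_lab, y_lab, X_unl, batch_size=2, k=3, seed=0):
    """Label unlabeled rows batch by batch, folding each labeled batch back into the data set."""
    rng = random.Random(seed)
    X_lab, y_lab, pending = list(X_lab), list(y_lab), list(X_unl)
    while pending:
        batch, pending = pending[:batch_size], pending[batch_size:]
        # Rebuild the labeling model on a fresh class-balanced sample each round.
        Xb, yb = balanced_sample(X_lab, y_lab, rng)
        new_labels = [knn_predict(Xb, yb, x, k) for x in batch]
        X_lab.extend(batch)
        y_lab.extend(new_labels)
    return X_lab, y_lab


# Toy data (hypothetical): label 0 = non-defective (majority), label 1 = defective (minority).
X_lab = [(0, 0), (0, 1), (1, 0), (1, 1), (0, 2), (2, 0), (10, 10), (10, 11), (11, 10)]
y_lab = [0, 0, 0, 0, 0, 0, 1, 1, 1]
X_unl = [(0.5, 0.5), (9.5, 9.5), (0.2, 0.1), (10.2, 9.9)]

X_all, y_all = iterative_label(X_lab, y_lab, X_unl)
```

In this sketch each round resamples a balanced subset before labeling, so the majority class cannot dominate the neighbor vote simply by being more numerous. Whether SISDP balances by undersampling, oversampling, or another scheme is not stated in the abstract.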

