Abstract

Software defect prediction (SDP) is an active research topic in software engineering. The performance of a predictor depends largely on the quality of the dataset used to train it. Software defect datasets are typically high-dimensional, which can degrade predictors built with data mining or machine learning algorithms. Feature selection, an effective means of dimensionality reduction, represents the entire feature space with a selected feature subset and alleviates the curse of dimensionality. In this paper, a wrapper feature selection approach that uses a genetic algorithm as the search strategy for finding the optimal feature subset is first introduced. Second, a defect prediction method based on an improved isolation forest is proposed. Exploratory experiments on five real NASA software defect datasets demonstrate that the proposed method can improve defect prediction performance to some extent and confirm the positive effect of feature selection in SDP applications.
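
The wrapper idea summarized above can be illustrated with a minimal sketch. It is not the paper's implementation: the nearest-centroid classifier, the holdout split, the toy data, and all names and parameter values (`fitness`, `ga_select`, `make_toy_data`, `pop_size`, `p_mut`) are assumptions chosen to keep the example self-contained. The isolation forest predictor is not reproduced here; any classifier could take the place of the stand-in inside `fitness`.

```python
import random


def fitness(mask, X, y):
    """Wrapper fitness: holdout accuracy of a nearest-centroid classifier
    (an assumed stand-in predictor) using only the features where mask is 1."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0  # an empty feature subset cannot predict anything
    train = range(0, len(X), 2)  # even rows for training
    test = range(1, len(X), 2)   # odd rows for evaluation
    centroids = {}
    for label in set(y[i] for i in train):
        rows = [X[i] for i in train if y[i] == label]
        centroids[label] = [sum(r[j] for r in rows) / len(rows) for j in cols]
    correct = 0
    for i in test:
        dist = {
            label: sum((X[i][j] - c[k]) ** 2 for k, j in enumerate(cols))
            for label, c in centroids.items()
        }
        correct += min(dist, key=dist.get) == y[i]
    return correct / len(test)


def ga_select(X, y, pop_size=20, generations=30, p_mut=0.05, seed=0):
    """Genetic search over binary feature masks, scored by the wrapper fitness."""
    rng = random.Random(seed)
    n = len(X[0])
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(pop, key=lambda m: fitness(m, X, y), reverse=True)
        elite = ranked[: pop_size // 2]  # keep the better half unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, n)  # single-point crossover
            child = a[:cut] + b[cut:]
            # flip each bit with probability p_mut
            child = [bit ^ (rng.random() < p_mut) for bit in child]
            children.append(child)
        pop = elite + children
    return max(pop, key=lambda m: fitness(m, X, y))


def make_toy_data(n_rows=60, n_features=6, seed=1):
    """Synthetic data: feature 0 carries the label signal, the rest are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_rows):
        label = (i // 2) % 2  # both labels appear in the train and test halves
        row = [label * 3.0 + rng.gauss(0, 0.5)]
        row += [rng.gauss(0, 1) for _ in range(n_features - 1)]
        X.append(row)
        y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    best = ga_select(X, y)
    print("selected mask:", best, "holdout accuracy:", fitness(best, X, y))
```

Because the fitness of each candidate subset is measured by training and evaluating a predictor on it, the search is a wrapper method rather than a filter method; the cost is one model fit per candidate per generation.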
