Abstract

We present a semi-supervised approach to software defect prediction. The method is designed to address two problematic characteristics of software defect datasets: a lack of labeled samples and class imbalance. It has three components. First, being semi-supervised, it exploits the wealth of unlabeled samples in software systems by evaluating the confidence probability of the predicted label for each unlabeled sample. Second, we jointly optimize the classifier parameters and the dictionary through a task-driven formulation, so that the learned features (sparse codes) are optimal for the trained classifier. Finally, we take the different misclassification costs into account during dictionary learning to improve prediction performance. Experimental results demonstrate that our method outperforms several representative state-of-the-art defect prediction methods.
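
To make two of the ideas above concrete (confidence-based use of unlabeled samples and unequal misclassification costs), the following is a minimal sketch and not the paper's implementation. It assumes a plain weighted logistic regression in place of the task-driven dictionary learning and sparse-coding classifier, and every name and value in it (`self_train`, `select_confident`, `cost_defective`, the 0.9 threshold) is hypothetical and chosen only for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_weighted_logreg(X, y, sample_weight, lr=0.1, n_iter=500):
    """Fit logistic regression by gradient descent with per-sample weights.

    Stand-in classifier (assumption): the abstract describes a
    dictionary-learning classifier, which is not reproduced here.
    """
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])  # append a bias column
    w = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        p = sigmoid(Xb @ w)
        grad = Xb.T @ (sample_weight * (p - y)) / sample_weight.sum()
        w -= lr * grad
    return w

def predict_proba(w, X):
    """Predicted probability of the defective class (label 1)."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    return sigmoid(Xb @ w)

def select_confident(proba, threshold=0.9):
    """Pick samples whose predicted-label confidence reaches the threshold.

    Returns their indices and the pseudo-labels assigned to them.
    """
    confidence = np.maximum(proba, 1.0 - proba)
    idx = np.where(confidence >= threshold)[0]
    return idx, (proba[idx] >= 0.5).astype(float)

def self_train(X_lab, y_lab, X_unl, cost_defective=5.0, threshold=0.9, rounds=3):
    """Self-training loop with cost-sensitive sample weights.

    cost_defective (hypothetical value) makes an error on a defective
    sample weigh more than an error on a non-defective one.
    """
    X, y, pool = X_lab.copy(), y_lab.copy(), X_unl.copy()
    for _ in range(rounds):
        weights = np.where(y == 1, cost_defective, 1.0)
        w = fit_weighted_logreg(X, y, weights)
        idx, pseudo = select_confident(predict_proba(w, pool), threshold)
        if idx.size == 0:
            break  # nothing confident enough to add
        X = np.vstack([X, pool[idx]])
        y = np.concatenate([y, pseudo])
        pool = np.delete(pool, idx, axis=0)
    # Refit once on the final labeled + pseudo-labeled set.
    weights = np.where(y == 1, cost_defective, 1.0)
    return fit_weighted_logreg(X, y, weights), X, y
```

Calling `self_train(X_lab, y_lab, X_unl)` with a small labeled matrix, its 0/1 labels, and an unlabeled matrix returns the fitted weights together with the enlarged training set. Raising `cost_defective` pushes the classifier toward flagging more modules as defective, which is the usual trade-off when missing a defect costs more than a false alarm.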
