Abstract

Software defect prediction (SDP) has been a prominent area of research in software engineering. Previous SDP methods often struggled in industrial applications, primarily because they require sufficient historical data. Clustering-based unsupervised defect prediction (CUDP) and cross-project defect prediction (CPDP) emerged to address this challenge. However, the former is limited in its ability to capture semantic and structural features, while the latter is constrained by differences in data distribution across projects. We therefore introduce a novel framework called improved clustering with graph-embedding-based features (IC-GraF) for SDP that does not rely on historical data. First, a preprocessing operation extracts program dependence graphs (PDGs) and marks the distinct dependency relationships within them. Second, the improved deep graph infomax (IDGI) model, an extension of the DGI model designed specifically for SDP, generates graph-level representations of the PDGs. Finally, a heuristic-based k-means clustering algorithm classifies the features generated by IDGI. To validate the efficacy of IC-GraF, we conduct experiments on 24 releases from the PROMISE dataset, using F-measure, G-measure, and AUC as evaluation criteria. The findings indicate that IC-GraF achieves 5.0% to 42.7% higher F-measure, 5% to 39.4% higher G-measure, and 2.5% to 11.4% higher AUC than existing CUDP methods. Even when compared with eight supervised learning-based SDP methods, IC-GraF retains a competitive edge.
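For readers unfamiliar with the evaluation criteria named above, the following is a minimal sketch of how F-measure and G-measure can be computed from binary predictions. It assumes the definitions commonly used in the defect prediction literature (F-measure as the harmonic mean of precision and recall; G-measure as the harmonic mean of recall and one minus the false-positive rate); the abstract does not spell these out, and the function names and the toy labels are purely illustrative.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, FP, TN, FN for binary labels (1 = defective, 0 = clean)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def f_measure(tp, fp, tn, fn):
    """Harmonic mean of precision and recall (tn is unused but kept for a uniform signature)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    return 2 * precision * recall / denom if denom else 0.0


def g_measure(tp, fp, tn, fn):
    """Harmonic mean of recall (pd) and 1 - false-positive rate (pf).

    Assumed definition: the one commonly used in SDP studies.
    """
    pd = tp / (tp + fn) if (tp + fn) else 0.0
    # Assumption: treat pf as 0 when there are no clean modules at all.
    pf = fp / (fp + tn) if (fp + tn) else 0.0
    specificity = 1.0 - pf
    denom = pd + specificity
    return 2 * pd * specificity / denom if denom else 0.0


# Illustrative toy labels, not data from the study.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
counts = confusion_counts(y_true, y_pred)
print(counts, f_measure(*counts), g_measure(*counts))
```

In an unsupervised setting such as the one described, `y_pred` would come from mapping cluster assignments to defective and clean labels, while `y_true` is used only for evaluation.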
