Abstract

Software defect detection is essential in software development. Most existing approaches apply Supervised Machine Learning (SML) techniques to software defect detection. However, SML techniques require a large amount of manual labelling for model training, which is time-consuming and laborious. An alternative is to apply Unsupervised Machine Learning (USML), which does not require labelled datasets and has already been applied to software defect detection. Spectral clustering, one of the USML approaches, shows promising performance in software defect detection. The core of spectral clustering is the similarity algorithm, which calculates the similarity between the metric values of software entities in order to detect software defects. Yet current studies on spectral clustering-based software defect detection models rarely consider the impact of different similarity algorithms on the detection results. To address this problem, we conduct an empirical study to investigate the impact of similarity algorithms in spectral clustering-based software defect detection models. We compare three similarity algorithms: k-nearest neighbours, fully connected, and vector dot product. We conduct experiments on two real-world datasets, AEEEM and PROMISE, and the experimental results show that the fully connected algorithm performs better than the other two algorithms in spectral clustering-based software defect detection.
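To make the three similarity algorithms concrete, the following is a minimal Python sketch of how each one could build the affinity matrix that spectral clustering operates on, followed by a simple two-way spectral split. The abstract does not give the exact formulas, so several details here are assumptions chosen for illustration: the Gaussian kernel and its width `sigma` for the fully connected graph, the binary symmetrised graph and the value of `k` for k-nearest neighbours, the clipping of negative dot products, and all function names.

```python
import numpy as np


def pairwise_sq_dists(X):
    """Squared Euclidean distances between all rows of X (entities x metrics)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.maximum(d2, 0.0)  # guard against tiny negative round-off


def fully_connected_affinity(X, sigma=1.0):
    """Every pair of entities is linked, weighted by a Gaussian kernel (assumed)."""
    W = np.exp(-pairwise_sq_dists(X) / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)  # no self-loops
    return W


def knn_affinity(X, k=2):
    """Binary graph linking each entity to its k nearest neighbours (assumed form)."""
    d2 = pairwise_sq_dists(X)
    np.fill_diagonal(d2, np.inf)  # an entity is not its own neighbour
    n = X.shape[0]
    nearest = np.argsort(d2, axis=1)[:, :k]
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), k), nearest.ravel()] = 1.0
    return np.maximum(W, W.T)  # symmetrise: keep an edge if either side chose it


def dot_product_affinity(X):
    """Similarity as the dot product of metric vectors, negatives clipped (assumed)."""
    W = X @ X.T
    np.fill_diagonal(W, 0.0)
    return np.maximum(W, 0.0)


def spectral_bipartition(W):
    """Split entities into two clusters using the second-smallest eigenvector
    of the symmetric normalised graph Laplacian, thresholded at zero."""
    d = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(d, 1e-12))
    L = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L)  # eigenvalues returned in ascending order
    fiedler = d_inv_sqrt * vecs[:, 1]
    return (fiedler > 0).astype(int)


# Hypothetical toy data: six entities described by two metric values each.
X = np.array([[0.0, 0.0], [0.2, 0.0], [0.0, 0.2],
              [3.0, 3.0], [3.2, 3.0], [3.0, 3.2]])
labels = spectral_bipartition(fully_connected_affinity(X, sigma=1.0))
```

Swapping `fully_connected_affinity` for `knn_affinity` or `dot_product_affinity` changes only the graph that is handed to the clustering step, which is the kind of substitution the study compares. Which cluster is then treated as defective is a separate decision that this sketch does not cover.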

