Abstract

Cross-project defect prediction (CPDP) identifies defect-prone software modules in one project (the target) using historical data collected from other projects (the source), helping developers find bugs and prioritize their testing efforts. CPDP has recently attracted great research interest. However, source and target data usually exhibit redundancy and nonlinearity. Moreover, most CPDP methods do not exploit source label information to uncover the underlying knowledge for label propagation. These factors often lead to unsatisfactory CPDP performance. To address these limitations, we propose a landmark selection-based kernelized discriminant subspace alignment (LSKDSA) approach for CPDP. LSKDSA not only reduces the discrepancy between the data distributions of the source and target projects, but also characterizes complex data structures and increases the likelihood that the data are linearly separable. Furthermore, LSKDSA encodes the label information of the source data into the domain adaptation learning process, which gives it good discriminant ability. Extensive experiments on 13 public projects from three benchmark datasets demonstrate that LSKDSA outperforms a range of competing CPDP methods, with improvements of 3.44%-11.23% in g-measure, 5.75%-11.76% in AUC, and 9.34%-33.63% in MCC.
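
For readers unfamiliar with the three reported indicators, the following is a minimal sketch of how they can be computed for a binary defect predictor. It is illustrative only and not taken from the paper: the abstract does not define g-measure, so the sketch assumes the definition common in defect prediction (the harmonic mean of the probability of detection, pd, and 1 - pf, where pf is the false-alarm rate). The function names and the sample labels are hypothetical.

```python
import math

def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels (1 = defective, 0 = clean)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn

def g_measure(y_true, y_pred):
    """Assumed definition: harmonic mean of pd (recall) and 1 - pf (specificity)."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    pd = tp / (tp + fn) if (tp + fn) else 0.0
    pf = fp / (fp + tn) if (fp + tn) else 0.0
    spec = 1.0 - pf
    return 2 * pd * spec / (pd + spec) if (pd + spec) else 0.0

def mcc(y_true, y_pred):
    """Matthews correlation coefficient; returns 0.0 when the denominator is zero."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

def auc(y_true, scores):
    """AUC as the probability that a defective module is scored above a clean one.

    Ties count as half a win (Mann-Whitney formulation).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        return float("nan")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical target-project labels, hard predictions, and defect scores.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.3, 0.7, 0.4, 0.2, 0.1, 0.05]

print(round(g_measure(y_true, y_pred), 4))
print(round(mcc(y_true, y_pred), 4))
print(round(auc(y_true, scores), 4))
```

g-measure and MCC depend on a decision threshold, whereas AUC is computed from the ranking of scores alone, which is one reason the three indicators can move by different amounts for the same method.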
