Abstract
Existing cross-project defect prediction (CPDP) approaches assume a close relation between the source data sets and the target data sets. In the heterogeneous cross-project defect prediction (HCPDP) problem, by contrast, the target data sets can be totally different from the source data sets. To narrow the difference between source and target data sets, we implement our own algorithm, SLA+, based on the selective learning algorithm. From the multiple candidate sources, we select the one with the highest similarity to the target data set as the source data set, and we select one or more of the remaining candidates that are similar to both the target data set and the source data set as an intermediate domain. The intermediate domain serves as a bridge between the source domain and the target domain, which breaks up the large distribution gap that hinders knowledge transfer between them. In addition, we reduce dimensionality by mining the latent relationships between features. We conduct experiments on open-source data sets, all of which are heterogeneous. The experiments show that our method achieves results comparable to state-of-the-art HCPDP methods in most cases.
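The source and intermediate-domain selection described above can be illustrated with a minimal Python sketch. It is not the SLA+ implementation. It assumes that similarity scores between data sets have already been computed by some measure that works across heterogeneous feature spaces; the abstract does not specify that measure. The function name `select_domains`, the `threshold` parameter, and the min-based `bridge_score` are all hypothetical choices made for illustration.

```python
def select_domains(sim_to_target, sim_between, threshold=0.5):
    """Pick a source data set and one or more intermediate data sets.

    sim_to_target: dict mapping candidate name -> similarity to the target.
    sim_between:   dict mapping frozenset({a, b}) -> similarity between
                   candidates a and b.
    threshold:     hypothetical cut-off for counting as "similar".
    """
    if len(sim_to_target) < 2:
        raise ValueError("need at least two candidate data sets")

    # The candidate most similar to the target becomes the source.
    source = max(sim_to_target, key=sim_to_target.get)
    others = [name for name in sim_to_target if name != source]

    def bridge_score(name):
        # Assumption: a bridge is only as good as its weaker link, so a
        # candidate must be similar to BOTH the target and the source.
        return min(sim_to_target[name], sim_between[frozenset((name, source))])

    intermediates = [name for name in others if bridge_score(name) >= threshold]
    if not intermediates:
        # Keep at least one intermediate: fall back to the best bridge.
        intermediates = [max(others, key=bridge_score)]
    return source, intermediates


# Made-up similarity values for three candidate data sets A, B and C.
sim_to_target = {"A": 0.9, "B": 0.7, "C": 0.2}
sim_between = {
    frozenset(("A", "B")): 0.8,
    frozenset(("A", "C")): 0.9,
    frozenset(("B", "C")): 0.3,
}
source, intermediates = select_domains(sim_to_target, sim_between)
print(source, intermediates)  # prints: A ['B']
```

In this toy input, C is close to the source A but far from the target, so it is rejected as a bridge. Taking the minimum of the two similarities is one simple way to encode "similar to both"; other aggregation rules would be equally plausible under the abstract's description.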