Abstract

Repeated calculations sharply increase the running time of correlation-based feature selection. Some algorithms apply incremental iteration to improve efficiency, but the computational efficiency of the correlation calculation itself has generally been ignored. An algorithm acceleration framework for correlation-based feature selection (AFCFS) is proposed. In AFCFS, the feature selection criterion is analyzed and reconstructed at entropy granularity, and the algorithm structure is adjusted accordingly. Specifically, every repeated part of the calculation is saved in mapping tables and accessed directly the next time it is needed, which further reduces the calculation repetition rate and improves efficiency. The experimental results show that AFCFS can greatly reduce the time cost of these algorithms while keeping the corresponding classification accuracy basically unchanged.

Highlights

  • Correlation-based feature selection has been widely used in software defect prediction to construct the feature subset because of its simple principle and good stability

  • A few researchers have recognized the high time cost of correlation-based feature selection and optimized their algorithm structure, for example Feature selection with redundancy-complementariness dispersion (RCDFS) [2], the Interaction Weight based Feature Selection algorithm (IWFS) [3], and the fast greedy feature selection algorithm (FGS_KDE) [4]. Most of them only optimize the number of iterations through incremental iteration or weight update, and pay little attention to the calculation process of the correlation itself

  • To reduce the influence of the choice of classifier, two typical classifiers, K-Nearest Neighbor (KNN) and Naive Bayes (NB), are used

Summary

Introduction

Correlation-based feature selection has been widely used in software defect prediction to construct the feature subset because of its simple principle and good stability. A few researchers have recognized the high time cost of correlation-based feature selection and optimized their algorithm structure, for example Feature selection with redundancy-complementariness dispersion (RCDFS) [2], the Interaction Weight based Feature Selection algorithm (IWFS) [3], and the fast greedy feature selection algorithm (FGS_KDE) [4]. Most of them only optimize the number of iterations through incremental iteration or weight update, and pay little attention to the calculation process of the correlation itself. Repeated calculations can be avoided to the greatest extent possible, improving operating efficiency without changing the performance of the algorithm itself.
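The idea of avoiding repeated correlation calculations can be illustrated with a small sketch. It is not the AFCFS implementation: it assumes that the correlation measure is mutual information over discrete features, decomposed as I(X;Y) = H(X) + H(Y) - H(X,Y), and that each entropy term is stored in a dictionary acting as a mapping table. The class name `EntropyCache`, its methods, and the toy data are all hypothetical.

```python
from collections import Counter
from math import log2


class EntropyCache:
    """Hypothetical mapping table: tuple of column names -> (joint) entropy."""

    def __init__(self, columns):
        self.columns = columns  # dict: column name -> list of discrete values
        self.table = {}         # the mapping table of already computed entropies
        self.misses = 0         # how many entropies were actually computed

    def entropy(self, *names):
        # Sort the names so that H(a, b) and H(b, a) share one table entry.
        key = tuple(sorted(names))
        if key not in self.table:
            self.misses += 1
            rows = list(zip(*(self.columns[n] for n in key)))
            total = len(rows)
            self.table[key] = -sum(
                (count / total) * log2(count / total)
                for count in Counter(rows).values()
            )
        return self.table[key]

    def mutual_information(self, a, b):
        # I(a; b) = H(a) + H(b) - H(a, b); every term goes through the table.
        return self.entropy(a) + self.entropy(b) - self.entropy(a, b)


# Toy data (assumed for illustration): two features and a class label.
data = {
    "f1": [0, 0, 1, 1],
    "f2": [0, 1, 0, 1],
    "label": [0, 0, 1, 1],
}
cache = EntropyCache(data)
relevance_f1 = cache.mutual_information("f1", "label")
relevance_f2 = cache.mutual_information("f2", "label")
redundancy = cache.mutual_information("f1", "f2")
print(relevance_f1, relevance_f2, redundancy, cache.misses)
```

The three mutual information values need nine entropy terms in total, but only six distinct ones are computed here because H(f1), H(f2), and H(label) are each reused from the table. In a selection loop that evaluates many feature pairs over several iterations, the share of reused terms would be expected to grow.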

Correlation-based feature selection
Optimization on number of iterations
Optimization on correlation calculations
Experiment design
Experiment result and analysis
Conclusion
