Abstract
Repeated calculations lead to a sharp increase in the running time of correlation-based feature selection. Incremental iteration has been applied in some algorithms to improve efficiency; however, the computational cost of the correlation measure itself has generally been ignored. An acceleration framework for correlation-based feature selection (AFCFS) is proposed. In AFCFS, the feature selection criterion is analyzed and reconstructed at the granularity of entropy terms, and the algorithm structure is adjusted accordingly. Specifically, all repeated parts of the calculation are saved in mapping tables and can be accessed directly the next time they are needed, which further reduces the rate of repeated calculation and improves efficiency. The experimental results show that AFCFS greatly reduces the running time of these algorithms while keeping the corresponding classification accuracy essentially unchanged.
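A minimal sketch of what such mapping tables might look like, assuming discretized features and the standard entropy and mutual-information definitions; the function and table names below are illustrative assumptions, not identifiers from the AFCFS paper.

```python
# Illustrative mapping tables for entropy terms that would otherwise be
# recomputed many times during correlation-based feature selection.
import numpy as np
from collections import Counter

_entropy_table = {}        # feature index -> H(X_i)
_joint_entropy_table = {}  # sorted pair (i, j) -> H(X_i, X_j)

def entropy(values):
    """Shannon entropy of a discrete 1-D sequence."""
    counts = np.array(list(Counter(values).values()), dtype=float)
    probs = counts / counts.sum()
    return -np.sum(probs * np.log2(probs))

def cached_entropy(X, i):
    """Entropy of column i, computed once and then read from the table."""
    if i not in _entropy_table:
        _entropy_table[i] = entropy(X[:, i])
    return _entropy_table[i]

def cached_mutual_info(X, i, j):
    """I(X_i; X_j) = H(X_i) + H(X_j) - H(X_i, X_j), built from cached terms."""
    key = (min(i, j), max(i, j))
    if key not in _joint_entropy_table:
        joint = list(zip(X[:, key[0]], X[:, key[1]]))
        _joint_entropy_table[key] = entropy(joint)
    return cached_entropy(X, i) + cached_entropy(X, j) - _joint_entropy_table[key]
```

The tables are keyed by column index, so they assume a single fixed data matrix; each entropy and joint-entropy term is computed at most once no matter how many correlation scores reuse it.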
Highlights
Correlation-based feature selection has been widely used in software defect prediction to construct feature subsets because of its simple principle and good stability
A few researchers have recognized the expensive time cost of correlation-based feature selection and optimized their algorithm structures, such as feature selection with redundancy-complementariness dispersion (RCDFS) [2], the Interaction Weight based Feature Selection algorithm (IWFS) [3], and the fast greedy feature selection algorithm (FGS_KDE) [4]. However, most of them only reduce the number of iterations through incremental iteration or weight updates, and pay little attention to the calculation process of the correlation itself
To avoid the influence of any particular classifier, two typical classifiers, K-Nearest Neighbor (KNN) and Naive Bayes (NB), are used in the evaluation
Summary
Correlation-based feature selection has been widely used in software defect prediction to construct feature subsets because of its simple principle and good stability. A few researchers have recognized the expensive time cost of correlation-based feature selection and optimized their algorithm structures, such as feature selection with redundancy-complementariness dispersion (RCDFS) [2], the Interaction Weight based Feature Selection algorithm (IWFS) [3], and the fast greedy feature selection algorithm (FGS_KDE) [4]. However, most of them only reduce the number of iterations through incremental iteration or weight updates, and pay little attention to the calculation process of the correlation itself. By saving and reusing the repeated parts of that calculation, repeated computation can be avoided to the greatest extent, improving operating efficiency without changing the performance of the algorithm itself
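As a usage illustration, the hypothetical greedy forward-selection loop below reuses the cached terms from the earlier sketch; the max-relevance-minus-mean-redundancy criterion is a stand-in for the correlation-based criteria mentioned above (RCDFS, IWFS, FGS_KDE), not the paper's own criterion.

```python
# Hypothetical greedy forward selection that benefits from the mapping tables:
# every I(feature; class) and I(feature; feature) term is computed once and
# reused across all later candidate evaluations.
def greedy_select(X, y, k):
    n_features = X.shape[1]
    Xy = np.column_stack([X, y])   # append the class column
    target = n_features            # index of the class column in Xy
    # The tables assume one fixed data matrix, so reset them for Xy.
    _entropy_table.clear()
    _joint_entropy_table.clear()
    remaining = set(range(n_features))
    selected = []
    while len(selected) < k and remaining:
        def score(f):
            relevance = cached_mutual_info(Xy, f, target)
            redundancy = (sum(cached_mutual_info(Xy, f, s) for s in selected)
                          / len(selected)) if selected else 0.0
            return relevance - redundancy
        best = max(remaining, key=score)
        remaining.remove(best)
        selected.append(best)
    return selected
```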