Abstract
Data classification is a challenging problem that is highly sensitive to noise and to the high dimensionality of the data. Reducing model complexity can help improve the accuracy of the classification model. Therefore, in this research, we propose a novel feature selection technique based on the Binary Harris Hawks Optimizer with Time-Varying Scheme (BHHO-TVS). The proposed BHHO-TVS adopts a time-varying transfer function that leverages the influence of the location vector to balance the exploration and exploitation power of HHO. Eighteen well-known datasets from the UCI repository were used to show the significance of the proposed approach. The reported results show that BHHO-TVS outperforms BHHO with traditional binarization schemes as well as other binary feature selection methods such as the binary gravitational search algorithm (BGSA), binary particle swarm optimization (BPSO), the binary bat algorithm (BBA), the binary whale optimization algorithm (BWOA), and the binary salp swarm algorithm (BSSA). Compared with similar feature selection approaches introduced in previous studies, the proposed method achieves the best accuracy rates on 67% of the datasets.
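The abstract does not state the form of the time-varying transfer function, so the following is only a minimal sketch of the general idea and not the authors' formulation. It assumes an S-shaped (sigmoid) function whose input is divided by a control parameter tau that shrinks linearly over the iterations. The names control_parameter and time_varying_sigmoid, and the bounds tau_max = 4.0 and tau_min = 0.01, are illustrative assumptions.

```python
import math


def control_parameter(t, max_iter, tau_max=4.0, tau_min=0.01):
    """Linearly shrink tau from tau_max (t = 0) to tau_min (t = max_iter).

    The bounds are illustrative assumptions, not values from the paper.
    """
    return tau_max - (tau_max - tau_min) * (t / max_iter)


def time_varying_sigmoid(x, tau):
    """S-shaped transfer function that gets steeper as tau shrinks.

    Written in a numerically stable form so that large |x / tau|
    does not overflow math.exp.
    """
    z = x / tau
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


max_iter = 100
# Early iterations: large tau flattens the curve, so probabilities stay
# close to 0.5 and bits are set almost at random (exploration).
early = time_varying_sigmoid(1.0, control_parameter(0, max_iter))
# Late iterations: small tau approaches a step function, so the sign of
# the location value decides the bit (exploitation).
late = time_varying_sigmoid(1.0, control_parameter(max_iter, max_iter))
```

Under these assumptions the same location value of 1.0 maps to a probability only slightly above 0.5 at the first iteration and to a probability close to 1 at the last one, which is how such a schedule can shift the search from exploration to exploitation.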
Highlights
Data mining is regarded as an important step in the knowledge discovery process. It has become an active research domain due to the huge collections of digital data that need to be explored and transformed into useful patterns.
Following the hold-out method, each dataset is arbitrarily split into two portions: 80% of the data is reserved for training, while the rest is used for testing.
Optimization Algorithm (GOA) in [60], a Gravitational Search Algorithm (GSA) boosted with evolutionary crossover and mutation operators in [61], GOA with Evolutionary Population Dynamics (EPD) stochastic search strategies in [62], BDA [35], a hybrid approach based on Grey Wolf Optimization (GWO) and Particle Swarm Optimization (PSO) in [12], and the Binary Butterfly Optimization Algorithm (BOA) [63].
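The 80/20 hold-out split mentioned in the highlights can be sketched as follows. This is a minimal illustration that assumes a seeded random shuffle of the rows. The function name holdout_split, the seed, and the toy data are hypothetical, and the source does not say how the split was drawn or whether it was stratified.

```python
import random


def holdout_split(rows, train_fraction=0.8, seed=0):
    """Shuffle the rows and split them into training and testing portions."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    indices = list(range(len(rows)))
    # A seeded generator keeps the split reproducible (an assumption here).
    random.Random(seed).shuffle(indices)
    cut = int(round(train_fraction * len(rows)))
    train = [rows[i] for i in indices[:cut]]
    test = [rows[i] for i in indices[cut:]]
    return train, test


# Toy stand-in for a dataset with 100 instances.
data = list(range(100))
train, test = holdout_split(data)
```

Every instance ends up in exactly one of the two portions, so the test set is never seen while a feature subset is being fitted on the training set.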
Summary
Data mining is regarded as an important step in the knowledge discovery process. It has become an active research domain due to the huge collections of digital data that need to be explored and transformed into useful patterns. Wrapper approaches mainly rely on a machine learning classifier such as K-Nearest Neighbors (KNN) or Support Vector Machines (SVM) to evaluate the feature subset. Another aspect for categorizing feature selection (FS) methods is the selection mechanism used to explore the feature space in search of the most informative features. Several binarization schemes have been introduced to adapt real-valued meta-heuristics to discrete search spaces. The second binarization scheme is commonly used for adapting meta-heuristics to work in a binary search space. In this regard, Transfer Functions (TFs) are divided by shape into two types: S-shaped and V-shaped [31,32,33].
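The two transfer-function families can be illustrated with one common member of each: the sigmoid for the S-shaped family and |tanh(x)| for the V-shaped family. The sketch below is not taken from the paper, and the function names and sample values are assumptions. It shows the usual difference in how each family is applied: an S-shaped value is read as the probability of setting a bit to 1, while a V-shaped value is read as the probability of flipping the current bit.

```python
import math
import random


def s_shaped(x):
    """Sigmoid: probability that the feature bit is set to 1."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def v_shaped(x):
    """|tanh(x)|: probability that the current feature bit is flipped."""
    return abs(math.tanh(x))


def binarize_s(position, rng):
    """Set each bit to 1 with probability S(x), otherwise to 0."""
    return [1 if rng.random() < s_shaped(x) else 0 for x in position]


def binarize_v(position, current_bits, rng):
    """Flip each current bit with probability V(x), otherwise keep it."""
    return [1 - b if rng.random() < v_shaped(x) else b
            for x, b in zip(position, current_bits)]


rng = random.Random(42)
# Hypothetical real-valued location vector for three features.
position = [-50.0, 0.0, 50.0]
bits_s = binarize_s(position, rng)
bits_v = binarize_v(position, [1, 1, 0], rng)
```

With the S-shaped rule a strongly negative value deselects the feature and a strongly positive one selects it. With the V-shaped rule only the magnitude matters: large magnitudes flip the bit and a value of zero leaves it unchanged.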