Abstract

Data classification is a challenging problem that is highly sensitive to noise and to the high dimensionality of the data. Reducing model complexity can help improve the accuracy of the classification model. Therefore, in this research, we propose a novel feature selection technique based on a Binary Harris Hawks Optimizer with a Time-Varying Scheme (BHHO-TVS). The proposed BHHO-TVS adopts a time-varying transfer function that leverages the influence of the location vector to balance the exploration and exploitation power of the HHO. Eighteen well-known datasets from the UCI repository were used to demonstrate the significance of the proposed approach. The reported results show that BHHO-TVS outperforms BHHO with traditional binarization schemes as well as other binary feature selection methods such as the binary gravitational search algorithm (BGSA), binary particle swarm optimization (BPSO), the binary bat algorithm (BBA), the binary whale optimization algorithm (BWOA), and the binary salp swarm algorithm (BSSA). Compared with similar feature selection approaches introduced in previous studies, the proposed method achieves the best accuracy rates on 67% of the datasets.

Highlights

  • Data mining is regarded as an important step in the knowledge discovery process. It has become an active research domain due to the presence of huge collections of digital data that need to be explored and transformed into useful patterns

  • Following the hold-out method, each dataset is randomly split into two portions: 80% of the data is reserved for training and the rest is used for testing

  • Optimization Algorithm (GOA) in [60], Gravitational Search Algorithm (GSA) boosted with evolutionary crossover and mutation operators in [61], GOA with Evolutionary Population Dynamics (EPD) stochastic search strategies in [62], BDA [35], hybrid approach based on Grey Wolf Optimization (GWO) and Particle Swarm Optimization (PSO) in [12] and Binary Butterfly Optimization Algorithm (BOA) [63]
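
The hold-out protocol described in the highlights can be illustrated with a minimal Python sketch. The function name `holdout_split`, the fixed seed, and the rounding rule are assumptions made for illustration; the source states only that 80% of each dataset is used for training and the rest for testing.

```python
import random


def holdout_split(n_samples, train_fraction=0.8, seed=0):
    """Return (train_indices, test_indices) for a random hold-out split.

    Hypothetical helper: the seed and the rounding of the cut point are
    illustrative choices, not details taken from the paper.
    """
    indices = list(range(n_samples))
    # Shuffle a copy of the row indices with a private, seeded generator.
    random.Random(seed).shuffle(indices)
    cut = int(round(train_fraction * n_samples))
    return indices[:cut], indices[cut:]


# Example: 10 samples give 8 training rows and 2 test rows.
train_idx, test_idx = holdout_split(10)
```

Splitting indices rather than the data itself keeps the sketch independent of how a dataset is stored.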


Summary

Introduction

Data mining is regarded as an important step in the knowledge discovery process. It has become an active research domain due to the presence of huge collections of digital data that need to be explored and transformed into useful patterns. Wrapper approaches mainly use a machine learning classifier such as K-Nearest Neighbors (KNN) or Support Vector Machines (SVM) to evaluate the feature subset. Another aspect for categorizing feature selection (FS) methods is the selection mechanism used to explore the feature space in search of the most informative features. Several binarization schemes have been introduced to adapt real-valued meta-heuristics to discrete search spaces. The second binarization scheme is commonly used for adapting meta-heuristics to work in a binary search space. In this regard, Transfer Functions (TFs) are divided by shape into two types: S-shaped and V-shaped [31,32,33].
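
To make the two transfer-function families concrete, the following Python sketch shows one common form of each, plus a time-varying variant of the S-shaped function. The specific formulas (a sigmoid with a steepness parameter `tau`, and `|tanh(x)|` for the V shape), the linear `tau` schedule, and its bounds are assumptions chosen for illustration and are not taken from the summary above; the paper's own BHHO-TVS definitions may differ.

```python
import math
import random


def s_shaped(x, tau=1.0):
    """S-shaped TF: sigmoid of x / tau (tau is an assumed steepness control)."""
    z = x / tau
    # Numerically stable sigmoid that avoids overflow for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def v_shaped(x):
    """V-shaped TF: |tanh(x)|, symmetric around zero."""
    return abs(math.tanh(x))


def tau_schedule(t, max_iter, tau_max=4.0, tau_min=0.01):
    """Assumed linear decrease of tau from tau_max (t=0) to tau_min (t=max_iter)."""
    return tau_max - (tau_max - tau_min) * t / max_iter


def binarize_s(position, tau, rng):
    """S-shaped rule: each bit becomes 1 with probability S(x, tau)."""
    return [1 if rng.random() < s_shaped(x, tau) else 0 for x in position]


def binarize_v(position, current_bits, rng):
    """V-shaped rule: each current bit is flipped with probability V(x)."""
    return [1 - b if rng.random() < v_shaped(x) else b
            for x, b in zip(position, current_bits)]


rng = random.Random(42)
position = [-2.0, -0.5, 0.0, 0.5, 2.0]  # hypothetical real-valued location vector
early_bits = binarize_s(position, tau_schedule(0, 100), rng)
late_bits = binarize_s(position, tau_schedule(100, 100), rng)
flipped = binarize_v(position, early_bits, rng)
```

In this sketch a large `tau` early in the run flattens the sigmoid, so bits are set almost at random (exploration). As `tau` shrinks, the function approaches a step and the sign of each position component decides the bit (exploitation).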

Related Works
Exploration Phase
Moving from Exploration to Exploitation
Exploitation Phase
Soft Besiege
Hard Besiege
Soft Besiege with Progressive Rapid Dives
Hard Besiege with Progressive Rapid Dives
Proposed Binary HHO
BHHO-Based FS
Results and Discussion
Comparison with Other Optimization Algorithms
Comparison with Results of Previous Works
Conclusions and Future Directions