Abstract

A well-constructed classification model depends heavily on the input feature subset drawn from a dataset, which may contain redundant, irrelevant, or noisy features. This challenge can be worse when dealing with medical datasets. The main aim of feature selection as a pre-processing task is to eliminate these features and select the most effective ones. In the literature, metaheuristic algorithms have performed well at finding optimal feature subsets. In this paper, two binary metaheuristic algorithms, the S-shaped binary Sine Cosine Algorithm (SBSCA) and the V-shaped binary Sine Cosine Algorithm (VBSCA), are proposed for feature selection from medical data. In these algorithms, the search space remains continuous, while a binary position vector is generated for each solution by one of two transfer functions, S-shaped or V-shaped. The proposed algorithms are compared with four recent binary optimization algorithms on five medical datasets from the UCI repository. The experimental results confirm that both bSCA variants improve classification accuracy on these medical datasets compared to the four other algorithms.
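The abstract does not give the transfer functions themselves. As a minimal sketch, assuming the sigmoid and |tanh| forms that are common in the binary metaheuristic literature (the function names, the flip rule for the V-shaped case, and the sample values are illustrative assumptions, not taken from the paper), a continuous position could be mapped to a binary feature mask like this:

```python
import math
import random


def s_shaped(x):
    """Sigmoid transfer function (assumed form): maps a real value into [0, 1]."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)  # numerically stable branch for negative x
    return z / (1.0 + z)


def v_shaped(x):
    """|tanh| transfer function (assumed form): large |x| means a likely change."""
    return abs(math.tanh(x))


def binarize_s(position, rng):
    """S-shaped rule: bit j becomes 1 with probability s_shaped(position[j])."""
    return [1 if rng.random() < s_shaped(x) else 0 for x in position]


def binarize_v(position, previous_bits, rng):
    """V-shaped rule: bit j is flipped with probability v_shaped(position[j])."""
    return [1 - b if rng.random() < v_shaped(x) else b
            for x, b in zip(position, previous_bits)]


rng = random.Random(0)
continuous = [-2.0, 0.0, 3.5]            # continuous position of one solution
bits_s = binarize_s(continuous, rng)     # feature j is selected when bits_s[j] == 1
bits_v = binarize_v(continuous, [0, 1, 0], rng)
```

In this sketch the S-shaped rule sets each bit directly from the current position, while the V-shaped rule decides only whether to flip the previous bit, so the two variants differ in how much of the earlier binary solution they retain.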

Highlights

  • With advances in technology, a massive amount of data is regularly generated and stored by real-world applications in areas such as medicine, transportation, tourism, and engineering

  • The results of the proposed binary versions of the Sine Cosine Algorithm (SCA), the S-shaped binary Sine Cosine Algorithm (SBSCA) and the V-shaped binary Sine Cosine Algorithm (VBSCA), are compared with other binary metaheuristic algorithms that are widely used to solve the feature selection problem

  • The proposed algorithms are applied to the feature selection problem for disease detection using the k-nearest neighbor (KNN) classifier
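To make the KNN-based evaluation concrete, here is a minimal sketch of a wrapper fitness for one binary feature mask. Everything in it is an assumption for illustration: the leave-one-out protocol, the weighted objective with `alpha`, the value of `k`, and the toy data are not taken from the paper.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k):
    """Majority vote among the k training rows closest to `query` (Euclidean)."""
    ranked = sorted(zip(train_X, train_y), key=lambda p: math.dist(p[0], query))
    return Counter(label for _, label in ranked[:k]).most_common(1)[0][0]


def apply_mask(X, mask):
    """Keep only the columns whose mask bit is 1."""
    return [[v for v, keep in zip(row, mask) if keep] for row in X]


def fitness(mask, X, y, k=3, alpha=0.99):
    """Lower is better: alpha * error rate + (1 - alpha) * fraction of features kept."""
    if not any(mask):
        return 1.0  # an empty subset cannot classify anything; treat it as worst case
    Xs = apply_mask(X, mask)
    errors = 0
    for i in range(len(Xs)):  # leave-one-out evaluation (assumed protocol)
        rest_X = Xs[:i] + Xs[i + 1:]
        rest_y = y[:i] + y[i + 1:]
        if knn_predict(rest_X, rest_y, Xs[i], k) != y[i]:
            errors += 1
    return alpha * errors / len(Xs) + (1 - alpha) * sum(mask) / len(mask)


# Toy data: column 0 separates the classes, column 1 is noise.
X = [[0.0, 5.0], [0.1, 1.0], [0.2, 9.0], [1.0, 5.1], [1.1, 0.9], [1.2, 9.1]]
y = [0, 0, 0, 1, 1, 1]
informative = fitness([1, 0], X, y)
noisy = fitness([0, 1], X, y)
```

On this toy data the mask that keeps only the informative column scores lower (better) than the mask that keeps only the noisy column, which is the signal a binary optimizer would use to rank candidate subsets.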


Summary

INTRODUCTION

With advances in technology, a massive amount of data is regularly generated and stored by real-world applications in areas such as medicine, transportation, tourism, and engineering.

Advanced Computing: An International Journal (ACIJ), Vol., No.1/2/3/4/5, September 2019

[...] consistency of the features to find the optimal subset. The latter uses a specific classifier to evaluate the quality of the selected features and find near-optimal solutions from an exponentially large set of candidate feature subsets. As the number of variables and the complexity of problems increase, high-dimensional problems are an emerging issue, and some metaheuristic algorithms, such as the conscious neighborhood-based crow search algorithm (CCSA) [18], have recently been proposed for solving large-scale optimization problems. The Sine Cosine Algorithm (SCA) was recently proposed for continuous optimization problems and has attracted the attention of many researchers, who have applied it to different applications. Although some binary variants of SCA have been proposed for discrete optimization problems, there is no variant of this algorithm for feature selection from medical datasets.
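For readers unfamiliar with SCA, the sketch below shows one iteration of the continuous position update as it is commonly stated for the original algorithm. It is not reproduced from this paper: the function name `sca_step`, the default `a=2.0`, and the list-based representation are assumptions made for illustration.

```python
import math
import random


def sca_step(positions, best, t, max_iter, rng, a=2.0):
    """Return new positions after one SCA iteration (each position is a list of floats)."""
    r1 = a - t * (a / max_iter)  # shrinks linearly from a to 0 over the run
    new_positions = []
    for x in positions:
        moved = []
        for xj, pj in zip(x, best):
            r2 = rng.uniform(0.0, 2.0 * math.pi)  # angle of the sine/cosine term
            r3 = rng.uniform(0.0, 2.0)            # random weight on the best position
            r4 = rng.random()                     # chooses between sine and cosine
            wave = math.sin(r2) if r4 < 0.5 else math.cos(r2)
            moved.append(xj + r1 * wave * abs(r3 * pj - xj))
        new_positions.append(moved)
    return new_positions


rng = random.Random(42)
population = [[0.5, -1.0], [2.0, 0.3]]
best_so_far = [1.0, 0.0]
next_population = sca_step(population, best_so_far, t=0, max_iter=4, rng=rng)
```

Because `r1` decays toward zero, early iterations take large steps around the best solution found so far and late iterations take small ones. A binary variant would then pass each updated coordinate through a transfer function to obtain a feature mask.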

RELATED WORKS
S-shaped Binary Sine Cosine Algorithm (SBSCA)
V-shaped Binary Sine Cosine Algorithm (VBSCA)
BINARY SINE COSINE ALGORITHM FOR THE FEATURE SELECTION
Experimental settings
Numerical Results
CONCLUSIONS