Abstract

The increasing dimensionality of data creates several problems in machine learning. It is therefore necessary to reduce the number of features by keeping only the most important ones and eliminating duplicate and unimportant features. Methods developed for this purpose are known as feature selection. This study proposes a wrapper feature selection technique based on the Dragonfly algorithm, a Swarm Intelligence method that searches for the best points in the search space to achieve optimization. The dragonfly optimization technique is used to find the optimal subset of features for accurately classifying breast cancer as benign or malignant. The fitness function is often defined as classification accuracy. In this study, a hard voting classifier serves as the evaluation (fitness) function: it scores the feature subset chosen by each dragonfly in the population. The proposed hard voting ensemble combines five machine-learning algorithms to produce a binary classification: Support Vector Machine (SVM), K-Nearest Neighbors (K-NN), Naive Bayes (NB), Decision Tree (DT), and Random Forest (RF). In the experiments, the voting ensemble classifier achieved higher accuracy than any of the single classifiers. Trained on the selected feature subset, the voting classifier reached an accuracy of 98.24%, compared with 96.49% when trained on all features. The proposed approach uses the Wisconsin Diagnostic Breast Cancer (WDBC) dataset from the UCI repository, which consists of 569 instances and 30 features.
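The evaluation step described in the abstract, scoring a candidate feature subset by the accuracy of a hard (majority) vote, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the 0/1 mask encoding of a dragonfly's position, and the toy classifiers and data are all assumptions made for illustration. In a real experiment each callable would wrap a trained model such as SVM, K-NN, NB, DT, or RF.

```python
from collections import Counter


def hard_vote(predictions):
    """Majority vote per sample; predictions holds one label list per classifier."""
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def select_columns(rows, mask):
    """Keep only the features whose mask entry is 1 (assumed subset encoding)."""
    return [[v for v, keep in zip(row, mask) if keep] for row in rows]


def fitness(mask, rows, y_true, classifiers):
    """Score a feature subset by the accuracy of the hard-voting ensemble."""
    if not any(mask):
        return 0.0  # an empty subset cannot be evaluated
    reduced = select_columns(rows, mask)
    predictions = [clf(reduced) for clf in classifiers]
    return accuracy(y_true, hard_vote(predictions))


# Hypothetical stand-ins for trained models: each maps rows to 0/1 labels.
def first_feature_rule(rows):
    return [1 if row[0] > 0.5 else 0 for row in rows]


def always_one(rows):
    return [1 for _ in rows]


def always_zero(rows):
    return [0 for _ in rows]


# Toy data (not WDBC): the first feature is informative, the second is not.
toy_rows = [[0.9, 0.1], [0.2, 0.8], [0.7, 0.3], [0.1, 0.6]]
toy_labels = [1, 0, 1, 0]
toy_classifiers = [first_feature_rule, always_one, always_zero]

score_first = fitness([1, 0], toy_rows, toy_labels, toy_classifiers)
score_second = fitness([0, 1], toy_rows, toy_labels, toy_classifiers)
```

A wrapper search such as the Dragonfly algorithm would call a function like `fitness` once per candidate mask and favor masks with higher scores. An odd number of voters on a binary problem, as with the five classifiers named in the abstract, avoids tied votes.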
