Abstract

Feature selection, or dimensionality reduction, can be considered a multi-objective minimization problem with two objectives: minimizing the number of selected features and minimizing the classification error rate simultaneously. Despite this, most existing approaches treat feature selection as a single-objective optimization problem. Recently, the Multi-objective Grey Wolf Optimizer (MOGWO) was proposed to solve multi-objective optimization problems. However, MOGWO was originally designed for continuous optimization problems and therefore cannot be applied directly to multi-objective feature selection problems, which are inherently discrete. In this research, a binary version of MOGWO based on a sigmoid transfer function, called BMOGWO-S, is developed for feature selection. A wrapper-based Artificial Neural Network (ANN) is used to assess the classification performance of each subset of selected features. To validate the performance of the proposed method, 15 standard benchmark datasets from the UCI repository are employed. BMOGWO-S was compared with a binary MOGWO based on a tanh transfer function, the Non-dominated Sorting Genetic Algorithm (NSGA-II), and Multi-objective Particle Swarm Optimization (MOPSO). The results show that BMOGWO-S can effectively determine a set of non-dominated solutions and outperforms the existing multi-objective approaches in most cases in terms of both feature reduction and classification error rate, while benefiting from a lower computational cost.
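To illustrate the core binarization idea from the abstract, the sketch below shows how a sigmoid (S-shaped) transfer function can map a continuous wolf position to a binary feature mask, together with a two-objective fitness and a Pareto dominance check of the kind used to collect non-dominated solutions. This is a minimal sketch under stated assumptions, not the authors' implementation: `error_fn` stands in for the wrapper ANN's error rate, and all names are hypothetical.

```python
import numpy as np

def sigmoid_transfer(x):
    """S-shaped transfer function: maps a continuous value into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

def binarize(position, rng):
    """Turn a continuous wolf position into a binary feature mask.

    Each feature is selected (bit = 1) when a uniform random draw
    falls below the sigmoid probability for that dimension.
    """
    probs = sigmoid_transfer(position)
    return (rng.random(position.shape) < probs).astype(int)

def dominates(a, b):
    """Pareto dominance for minimization: a dominates b if it is no worse
    on every objective and strictly better on at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def fitness(mask, error_fn):
    """Two objectives, both minimized: (feature ratio, classification error).
    `error_fn` is a placeholder for the wrapper classifier's error rate."""
    n_selected = int(mask.sum())
    if n_selected == 0:               # an empty feature subset is penalized
        return (1.0, 1.0)
    return (n_selected / mask.size, error_fn(mask))

# Example usage with a dummy error function:
rng = np.random.default_rng(42)
position = rng.normal(size=9)         # e.g., a 9-feature dataset
mask = binarize(position, rng)
print(mask, fitness(mask, lambda m: 0.05))
```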

Highlights

  • Data mining is an important branch of artificial intelligence and machine learning

  • BMOGWO-S selected approximately 45% of the original features (4 of 9) in the Breastcancer dataset, about 38% (5 of 13) in HeartEW, 6 of 13 features in WineEW, 5 of 16 in Zoo, and 5 of 18 in Lymphography; all of these are small datasets, and in most small datasets the number of features was reduced to 30% or less while attaining a lower error rate than using the full feature set

Introduction

The main goal of data mining is to extract useful information embedded in data and convert it, through multiple phases (e.g., pre- and post-processing as well as visualization), into a simple format that users can understand [1]. It is hard to detect the valuable features in a large feature set because the search space is generally large: a dataset may contain a huge number of features, including redundant and irrelevant ones, which in turn degrades classification performance [2]. The main goal of feature selection is to remove redundant and irrelevant features and make the model more efficient. Feature selection is useful for improving classification performance, simplifying the learning model, and shortening training time [3].

