Abstract
The feature selection problem is one of the most significant issues in data classification. The purpose of feature selection is to select the smallest possible subset of features in order to increase accuracy and reduce the cost of data classification. In recent years, with the emergence of high-dimensional datasets containing only a small number of samples, classification models have suffered from over-fitting; hence, feature selection methods are needed to remove redundant and irrelevant features. Although various methods have recently been proposed for selecting an optimal subset of features with high precision, they suffer from problems such as instability, long convergence time, and settling on a semi-optimal solution as the final result; in other words, they have not been able to fully extract the effective features. In this paper, a hybrid method based on the IWSSr method and the Shuffled Frog Leaping Algorithm (SFLA) is proposed to select effective features in large-scale gene datasets. The proposed algorithm is implemented in two phases: filtering and wrapping. In the filter phase, the Relief method is used to weight the features. Then, in the wrapper phase, the SFLA and IWSSr algorithms search for effective features within a feature-rich region. The proposed method is evaluated on several standard gene expression datasets. The experimental results confirm that, compared to similar methods, the proposed approach achieves a more compact set of features along with higher accuracy. The source code and testing datasets are available at https://github.com/jimy2020/SFLA_IWSSr-Feature-Selection.
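To make the two-phase pipeline concrete, the sketch below illustrates the general idea in Python: a Relief-style weighting step followed by a simple incremental wrapper search over the ranked features. This is only an illustrative approximation, not the authors' implementation (the IWSSr replacement step and the SFLA frog-leaping search are omitted); the synthetic dataset, the Gaussian naive Bayes classifier, and all parameters are assumptions chosen for the example. See the linked repository for the actual code.

```python
# Minimal sketch of a filter + wrapper pipeline, assuming a Relief-style
# filter and an IWSS-like forward wrapper search. Not the paper's method.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB


def relief_weights(X, y, n_iter=100, rng=None):
    """Basic Relief: reward features that separate nearest hit/miss pairs."""
    rng = np.random.default_rng(rng)
    n_samples, n_features = X.shape
    span = X.max(axis=0) - X.min(axis=0) + 1e-12   # per-feature normalisation
    w = np.zeros(n_features)
    for _ in range(n_iter):
        i = rng.integers(n_samples)
        diffs = np.abs(X - X[i]) / span            # feature-wise differences
        dists = diffs.sum(axis=1)
        dists[i] = np.inf                          # exclude the sample itself
        same, other = y == y[i], y != y[i]
        hit = np.argmin(np.where(same, dists, np.inf))   # nearest same class
        miss = np.argmin(np.where(other, dists, np.inf)) # nearest other class
        w += diffs[miss] - diffs[hit]
    return w / n_iter


def incremental_wrapper(X, y, weights, clf=None, cv=5):
    """IWSS-style forward search over features ranked by their weights."""
    clf = clf or GaussianNB()
    order = np.argsort(weights)[::-1]              # strongest features first
    selected, best_acc = [], 0.0
    for f in order:
        candidate = selected + [f]
        acc = cross_val_score(clf, X[:, candidate], y, cv=cv).mean()
        if acc > best_acc:                         # keep the feature only if it helps
            selected, best_acc = candidate, acc
    return selected, best_acc


if __name__ == "__main__":
    # Synthetic stand-in for a high-dimensional, low-sample gene dataset.
    X, y = make_classification(n_samples=120, n_features=200,
                               n_informative=10, random_state=0)
    w = relief_weights(X, y, n_iter=50, rng=0)
    feats, acc = incremental_wrapper(X, y, w)
    print(f"selected {len(feats)} features, CV accuracy {acc:.3f}")
```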
Highlights
The feature selection problem is one of the most significant issues in data classification.
In the filter phase of the proposed method, the Relief algorithm is used to weight the features.
In the wrapper phase, the SFLA and IWSSr algorithms search for effective features among the highly weighted ones.
Summary
The feature selection problem is one of the most significant issues in data classification. Various methods have been proposed for selecting an optimal subset of features with high precision, but they encounter problems such as instability, long convergence time, and falling into local optima as the final result. Despite the success they have achieved, they have not been able to extract the most effective features. Each of these methods is described in detail in [3].