Abstract

This article introduces the Alternated Sorting Method Genetic Algorithm (ASMGA), an algorithm that performs feature selection and model selection simultaneously for Support Vector Machine (SVM) classifiers. ASMGA is a hybrid wrapper-filter algorithm that combines a Genetic Algorithm (GA) with Max-Margin Feature Selection (MMFS), a filter model that estimates feature importance from feature relevance and redundancy. Assigning different costs to different errors makes ASMGA cost-sensitive, so it selects relevant and independent features in a cost-sensitive manner. This research investigates the relationship between the cost sensitivity of the SVM models produced by ASMGA and the feature selection process. ASMGA approximates a set of Pareto optimal feature subsets based on three objectives: cost-sensitive error rate, feature subset size, and MMFS-based estimates of feature relevance and redundancy. This research also introduces a technique for handling multiple objectives: during the search, ASMGA alternates between two multi-objective sorting techniques, Weighted Sum (WS) of objectives and Non-dominated Sorting (NDS), according to a schedule of methods. This technique allows ASMGA to work as an elitist GA for some iterations and as a Non-dominated Sorting Genetic Algorithm (NSGA-II) for the remaining iterations. The proposed algorithm was tested on 11 benchmark datasets and compared to a canonical GA and NSGA-II. The algorithm and its variations performed on average 3.8% better than GA and NSGA-II for balanced datasets and 6.6% better for imbalanced datasets. The results and analysis in this article showcase the potential of ASMGA and help explain the interaction between cost sensitivity and feature selection.
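
The alternation between WS and NDS described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the schedule format (a repeating list of "WS" and "NDS" labels), the weights, and the convention that all three objectives are minimized are assumptions made for illustration. The NDS branch only orders individuals front by front and omits the crowding-distance step that NSGA-II also uses.

```python
from typing import List, Sequence

# Hypothetical representation: one tuple of objective values per individual,
# e.g. (cost-sensitive error, subset size, MMFS-based score), all minimized.
Objectives = Sequence[float]


def dominates(a: Objectives, b: Objectives) -> bool:
    """True if a is no worse than b on every objective and better on at least one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    better = any(x < y for x, y in zip(a, b))
    return no_worse and better


def weighted_sum_order(pop: List[Objectives], weights: Sequence[float]) -> List[int]:
    """Indices of pop sorted by ascending weighted sum of objectives."""
    scores = [sum(w * v for w, v in zip(weights, obj)) for obj in pop]
    return sorted(range(len(pop)), key=lambda i: scores[i])


def non_dominated_order(pop: List[Objectives]) -> List[int]:
    """Indices of pop ordered front by front, best front first."""
    remaining = set(range(len(pop)))
    order: List[int] = []
    while remaining:
        # The current front holds everything not dominated by another remaining individual.
        front = sorted(
            i for i in remaining
            if not any(dominates(pop[j], pop[i]) for j in remaining if j != i)
        )
        order.extend(front)
        remaining -= set(front)
    return order


def rank_population(
    pop: List[Objectives],
    generation: int,
    schedule: Sequence[str],
    weights: Sequence[float],
) -> List[int]:
    """Pick the sorting method for this generation from a repeating schedule."""
    method = schedule[generation % len(schedule)]
    if method == "WS":
        return weighted_sum_order(pop, weights)
    if method == "NDS":
        return non_dominated_order(pop)
    raise ValueError(f"unknown sorting method: {method}")


# Toy population with made-up objective values.
population = [(1.0, 4.0, 1.0), (2.0, 1.0, 1.0), (3.0, 5.0, 2.0)]
schedule = ["WS", "NDS"]
ws_ranking = rank_population(population, 0, schedule, weights=(1.0, 1.0, 1.0))
nds_ranking = rank_population(population, 1, schedule, weights=(1.0, 1.0, 1.0))
```

In this toy population the two methods disagree about the first two individuals: the weighted sum prefers the second individual outright, while non-dominated sorting places the first two in the same front because neither dominates the other. That difference is the kind of behavior an alternating schedule can exploit.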
