Abstract

Feature selection is effective at improving the accuracy and generalization of machine learning models, especially on tasks with high-dimensional data. In this article, we propose a novel self-learning feature selection (SLFS) approach, a wrapper method based on feature attributions that searches for optimal feature subsets more efficiently through three main improvements. First, we treat feature selection as a combinatorial optimization problem and, by analyzing the meta-heuristic algorithms used in feature selection, propose a unified local search framework for wrapper methods. Second, for the binary search space of feature selection, we propose two types of neighborhood structures for the local search framework: ring-type and line-type. Third, we focus on feature attribution methods, such as SHAP (SHapley Additive exPlanations) (Lundberg & Lee, 2017), which interpret each feature’s importance to predictions. In each iteration, we use SHAP values and other attributes of previously evaluated subsets to guide the selection of the next subsets. To validate the SLFS approach, we collected 16 classification datasets from the UCI repository and compared it with other meta-heuristic wrapper approaches in terms of fitness, accuracy, F1 score, and selection ratio. The experimental results show that the SLFS approach can obtain an optimal subset with fewer iterations and a small population, and that SHAP values help improve search efficiency.
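
The general idea of an attribution-guided local search over binary feature masks can be illustrated with a minimal sketch. This is not the SLFS algorithm itself: the abstract does not define the ring-type and line-type neighborhoods, so the sketch uses a plain single-bit flip, and the fitness formula, the `alpha` weight, and all function names are assumptions made for illustration. The toy `toy_score` and `toy_importance` functions stand in for a trained classifier's accuracy and for per-feature mean absolute SHAP values, respectively.

```python
import random

def fitness(mask, score_fn, alpha=0.99):
    # Assumed wrapper-style fitness (not taken from the paper): a weighted sum
    # of classification error and the fraction of selected features.
    # Lower is better.
    ratio = sum(mask) / len(mask)
    return alpha * (1.0 - score_fn(mask)) + (1.0 - alpha) * ratio

def guided_neighbor(mask, importance, rng):
    # Flip exactly one bit, sampled with attribution-based weights:
    # unselected features with high importance are likely to be added,
    # selected features with low importance are likely to be dropped.
    top = max(importance)
    weights = [
        (top - imp if bit else imp) + 1e-9  # small constant keeps every flip possible
        for bit, imp in zip(mask, importance)
    ]
    i = rng.choices(range(len(mask)), weights=weights, k=1)[0]
    child = list(mask)
    child[i] = 1 - child[i]
    return child

def local_search(n_features, score_fn, importance_fn, iters=200, seed=0):
    # Greedy local search: keep a candidate only if it strictly improves fitness.
    rng = random.Random(seed)
    best = [rng.randint(0, 1) for _ in range(n_features)]
    best_fit = fitness(best, score_fn)
    for _ in range(iters):
        cand = guided_neighbor(best, importance_fn(best), rng)
        cand_fit = fitness(cand, score_fn)
        if cand_fit < best_fit:
            best, best_fit = cand, cand_fit
    return best, best_fit

# Toy problem (hypothetical): only features 0, 3 and 5 carry signal.
RELEVANT = {0, 3, 5}

def toy_score(mask):
    # Stand-in for model accuracy: fraction of relevant features selected.
    return sum(mask[i] for i in RELEVANT) / len(RELEVANT)

def toy_importance(mask):
    # Stand-in for mean |SHAP| per feature; a real wrapper would recompute
    # attributions from a model trained on the current subset.
    return [1.0 if i in RELEVANT else 0.1 for i in range(len(mask))]

mask, fit = local_search(8, toy_score, toy_importance)
print(mask, fit)
```

In a real wrapper setting, `score_fn` would train and validate a classifier on the selected columns, which is the expensive step; guiding the flips with attributions is meant to reduce how many such evaluations are needed.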
