Abstract

This article introduces a new iterative approach to explainable feature learning. During each iteration, new features are first generated by applying arithmetic operations to the input set of features. These are then evaluated in terms of the agreement between the probability distributions of their values across samples of different classes. Finally, a graph-based approach for feature selection is proposed, which allows for selecting high-quality, uncorrelated features to be used in feature generation during the next iteration. As shown by the results, the proposed method improved the accuracy of all tested classifiers, with the best accuracies achieved using random forest. In addition, the method turned out to be insensitive to both input parameters, while superior performance over the state of the art was demonstrated on nine out of 15 test sets, with comparable results on the others. Finally, we demonstrate the explainability of the learned feature representation for knowledge discovery.
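The iteration described above (generate candidate features arithmetically, score them by how well their per-class value distributions disagree, keep the best) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the quality score approximates "probability distribution agreement" with a simple two-class histogram overlap, and the selection step is a plain top-k cut rather than the paper's graph-based filter.

```python
import itertools
import numpy as np

def generate_features(X):
    """Generate candidate features by applying pairwise arithmetic
    operations (+, -, *, protected /) to the input features."""
    candidates = []
    for i, j in itertools.combinations(range(X.shape[1]), 2):
        a, b = X[:, i], X[:, j]
        candidates.extend([a + b, a - b, a * b,
                           a / np.where(b == 0, 1.0, b)])  # protected division
    return np.column_stack(candidates)

def feature_quality(f, y):
    """Score a feature by the disagreement between its per-class value
    distributions, approximated as 1 minus the overlap of the two
    normalized class histograms (higher = better class separation)."""
    bins = np.histogram_bin_edges(f, bins=10)
    h0, _ = np.histogram(f[y == 0], bins=bins)
    h1, _ = np.histogram(f[y == 1], bins=bins)
    p0 = h0 / max(h0.sum(), 1)
    p1 = h1 / max(h1.sum(), 1)
    return 1.0 - np.minimum(p0, p1).sum()

def iterate_once(X, y, keep=5):
    """One iteration: generate, evaluate, and keep the top-scoring features."""
    F = np.column_stack([X, generate_features(X)])
    scores = np.array([feature_quality(F[:, k], y) for k in range(F.shape[1])])
    top = np.argsort(scores)[::-1][:keep]
    return F[:, top], scores[top]
```

Chaining `iterate_once` on its own output yields progressively deeper arithmetic combinations of the original features, which is what makes the learned representation traceable back to the inputs.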

Highlights

  • Feature learning, or representation learning, describes a set of techniques that allow for defining augmented data representation for improved utilization of classification or regression models [1]

  • While feature engineering traditionally consists of user-administered feature construction, feature evaluation, and feature selection steps [9], [10], feature learning follows these principles in an automated manner

  • As an alternative to the existing feature selection approaches that are either computationally expensive or unable to deal with correlated features, we proposed a new graph cut-based filtering technique that allows for extracting a subset of high-quality dissimilar features F_i^(N_{t+1}) ⊆ F_i^(N_t+M) from the concatenated input feature space F_i^(N_t+M)
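The graph cut-based filter in the last highlight can be approximated with a simple greedy scheme: treat features as nodes weighted by quality, connect any two features whose absolute correlation exceeds a threshold, and pick nodes in descending quality order while skipping neighbors of already selected nodes. This is a hedged sketch of the general idea (an approximate maximum-weight independent set), not the paper's exact cut formulation; `select_dissimilar` and its `corr_threshold` parameter are illustrative names.

```python
import numpy as np

def select_dissimilar(F, quality, corr_threshold=0.9):
    """Greedy graph-based filter: features are nodes, edges join pairs
    whose absolute Pearson correlation exceeds corr_threshold; select
    features by descending quality, skipping any feature too correlated
    with one already selected."""
    C = np.abs(np.corrcoef(F, rowvar=False))  # pairwise |correlation| matrix
    selected = []
    for k in np.argsort(quality)[::-1]:       # best-quality features first
        if all(C[k, s] <= corr_threshold for s in selected):
            selected.append(k)
    return selected
```

Given a near-duplicate pair of columns, only the higher-quality one survives, which is exactly the "high-quality and uncorrelated" property the selection step is after.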


Introduction

Feature learning, or representation learning, describes a set of techniques that allow for defining augmented data representation for improved utilization of classification or regression models [1]. While feature engineering traditionally consists of user-administered feature construction, feature evaluation, and feature selection steps [9], [10], feature learning follows these principles in an automated manner. Feature learning methods can be divided into unsupervised and supervised approaches [11]. While the former learn from unlabelled data, they rely on data transformations (i.e., feature constructions) and feature selections in order to …

Manuscript received October 30, 2020; revised April 9, 2021; accepted August 18, 2021.

