Abstract
We develop a new approach for feature selection via gain penalization in tree-based models. First, we show that previous methods do not perform sufficient regularization and often exhibit sub-optimal out-of-sample performance, especially when correlated features are present. We then develop a new gain penalization idea that performs a general local-global regularization for tree-based models. The new method allows for full flexibility in the choice of feature-specific importance weights, while also applying a global penalization. We validate our method on both simulated and real data, exploring how the hyperparameters interact, and we provide an implementation as an extension of the popular R package ranger.
Highlights
In many Machine Learning problems, features can be hard or economically expensive to obtain, and some may be irrelevant or poorly linked to the target.
For tree-based methods, there is no standard procedure for feature selection or regularization in the literature, as one would find for linear regression with the LASSO [2], for example.
We provide a general gain penalization procedure for tree-based models, which allows for a combination of local and global regularization parameters.
Summary
In many Machine Learning problems, features can be hard or economically expensive to obtain, and some may be irrelevant or poorly linked to the target. In [5], the authors treat trees as parametric models and use procedures analogous to LASSO-type shrinkage methods, penalizing the coefficients of the base learners and reducing the redundancy in each path from the root node to a leaf node. However, the selected features can still be redundant, since their focus is on reducing the number of rules rather than the number of features. We provide a general gain penalization procedure for tree-based models, which allows for a combination of local and global regularization parameters.
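The idea behind the procedure can be sketched as follows: at each candidate split, the raw gain of a feature not yet used in the tree is multiplied by a penalty that mixes a feature-specific (local) weight with a single global parameter, while features already in use keep their full gain. The sketch below is illustrative only, assuming a mixing parameter `gamma` between the local and global penalties; the names and the exact weighting scheme are our assumptions, not the API of the ranger extension.

```python
# Hypothetical sketch of local-global gain penalization during split
# selection. All names (gamma, lambda_global, local_weights) are
# illustrative assumptions, not the ranger extension's actual interface.

def penalized_gain(gain, feature, used_features,
                   local_weights, lambda_global, gamma):
    """Down-weight the split gain of features not yet used in the tree.

    The per-feature penalty mixes a local (feature-specific) weight with
    one global penalty:
        lambda_i = gamma * local_weights[feature]
                   + (1 - gamma) * lambda_global
    Features already used keep their full gain, encouraging reuse and
    hence a sparser set of selected features.
    """
    if feature in used_features:
        return gain
    lam = gamma * local_weights[feature] + (1 - gamma) * lambda_global
    return lam * gain

def choose_split(candidate_gains, used_features,
                 local_weights, lambda_global, gamma):
    """Pick the candidate feature whose penalized gain is largest."""
    return max(
        candidate_gains,
        key=lambda f: penalized_gain(candidate_gains[f], f, used_features,
                                     local_weights, lambda_global, gamma),
    )

# Toy example: feature "b" has the highest raw gain, but after
# penalization the already-used feature "a" wins the split.
gains = {"a": 0.40, "b": 0.50, "c": 0.30}
local = {"a": 0.9, "b": 0.5, "c": 0.7}
best = choose_split(gains, used_features={"a"}, local_weights=local,
                    lambda_global=0.6, gamma=0.5)
# best == "a": b's gain is shrunk to 0.5 * (0.5*0.5 + 0.5*0.6) = 0.275
```

Setting all local weights equal recovers a purely global penalization, while `gamma = 1` gives fully feature-specific penalties, which is the local-global flexibility the summary refers to.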