Abstract
We develop a new approach for feature selection via gain penalization in tree-based models. First, we show that previous methods do not perform sufficient regularization and often exhibit sub-optimal out-of-sample performance, especially when correlated features are present. We then develop a new gain penalization idea that performs a general local-global regularization for tree-based models. The new method allows for full flexibility in the choice of feature-specific importance weights, while also applying a global penalization. We validate our method on both simulated and real data, exploring how the hyperparameters interact, and we provide an implementation as an extension of the popular R package ranger.
Highlights
In many Machine Learning problems, features can be hard or economically expensive to obtain, and some may be irrelevant or poorly linked to the target.
For tree-based methods, there is no standard procedure for feature selection or regularization in the literature, as one would find for linear regression with the LASSO [2], for example.
We provide a general gain penalization procedure for tree-based models, which allows for a combination of local and global regularization parameters.
Summary
In many Machine Learning problems, features can be hard or economically expensive to obtain, and some may be irrelevant or poorly linked to the target. In [5], the authors treat trees as parametric models and use procedures analogous to LASSO-type shrinkage methods, penalizing the coefficients of the base learners and reducing the redundancy in each path from the root node to a leaf node. However, the selected features can still be redundant, since their focus is on reducing the number of rules rather than the number of features. We provide a general gain penalization procedure for tree-based models, which allows for a combination of local and global regularization parameters.
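The idea behind the procedure can be sketched as follows: at each candidate split, the raw gain of a feature not yet used in the tree is multiplied by a penalty that mixes a feature-specific (local) weight with a single global parameter, while features already in use keep their full gain. The sketch below is illustrative only, assuming a mixing parameter `gamma` between the local and global penalties; the names and the exact weighting scheme are our assumptions, not the API of the ranger extension.

```python
# Hypothetical sketch of local-global gain penalization during split
# selection. All names (gamma, lambda_global, local_weights) are
# illustrative assumptions, not the ranger extension's actual interface.

def penalized_gain(gain, feature, used_features,
                   local_weights, lambda_global, gamma):
    """Down-weight the split gain of features not yet used in the tree.

    The per-feature penalty mixes a local (feature-specific) weight with
    one global penalty:
        lambda_i = gamma * local_weights[feature]
                   + (1 - gamma) * lambda_global
    Features already used keep their full gain, encouraging reuse and
    hence a sparser set of selected features.
    """
    if feature in used_features:
        return gain
    lam = gamma * local_weights[feature] + (1 - gamma) * lambda_global
    return lam * gain

def choose_split(candidate_gains, used_features,
                 local_weights, lambda_global, gamma):
    """Pick the candidate feature whose penalized gain is largest."""
    return max(
        candidate_gains,
        key=lambda f: penalized_gain(candidate_gains[f], f, used_features,
                                     local_weights, lambda_global, gamma),
    )

# Toy example: feature "b" has the highest raw gain, but after
# penalization the already-used feature "a" wins the split.
gains = {"a": 0.40, "b": 0.50, "c": 0.30}
local = {"a": 0.9, "b": 0.5, "c": 0.7}
best = choose_split(gains, used_features={"a"}, local_weights=local,
                    lambda_global=0.6, gamma=0.5)
# best == "a": b's gain is shrunk to 0.5 * (0.5*0.5 + 0.5*0.6) = 0.275
```

Setting all local weights equal recovers a purely global penalization, while `gamma = 1` gives fully feature-specific penalties, which is the local-global flexibility the summary refers to.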