Abstract

The Random Subspace Method (RSM) is an ensemble procedure in which each constituent learner is constructed using a randomly chosen subset of the data features. Regression trees are ideal candidate learners in RSM ensembles. By constructing trees upon different feature subsets, RSM reduces correlation between trees resulting in a stronger ensemble. Furthermore, it lessens computational burden by only considering a subset of the features when building each tree. Despite its apparent advantages, RSM has a notable drawback. In some instances a randomly chosen subspace may lack informative features. This is especially true in situations in which the number of truly informative variables is small relative to the total number of variables. Trees that are constructed using feature subsets lacking informative features can be damaging to the ensemble. Here we present Grafted Random Subspaces (GRS) and Vanishing Random Subspaces (VRS), two novel ensemble procedures designed to remedy the aforementioned drawback by reusing information across trees. Both techniques borrow from RSM by growing individual trees on randomly selected feature subsets. For each tree in a GRS ensemble, the most important variable is identified and guaranteed inclusion into the next q feature subsets. This allows GRS to recycle a promising feature from one tree across several successive trees, effectively grafting the variable into the next q active subsets. In the VRS procedure the least important feature is guaranteed exclusion from the next q feature subsets. This creates a more enriched pool of candidate variables from which the successive feature subsets are drawn.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.