Abstract

Diversity forests are a class of random-forest-type prediction methods that modify the split selection procedure of conventional random forests to allow for complex split procedures. While random forests show strong prediction performance when using conventional univariate, binary splitting, this procedure still has disadvantages. For example, interactions between features are not exploited effectively. The split selection procedure of diversity forests consists of choosing the best splits from sets of 'nsplits' candidate splits obtained by random selection from repeatedly sampled, specifically structured collections of splits. This makes complex split procedures computationally tractable while avoiding overfitting. This paper focuses on introducing diversity forests and evaluating their performance for univariate, binary splitting. Specific, complex split procedures will be the focus of future work. Using a collection of 220 real data sets with binary target variables, diversity forests are compared with conventional random forests and with random forests using extremely randomized trees. The comparison shows that randomizing the split selection, as performed by diversity forests, leads to slight improvements in prediction performance and that this performance is quite robust with regard to the specified 'nsplits' value. These results indicate that diversity forests are well suited for realizing complex split procedures in random forests.
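
To make the idea of choosing the best split from a small random set of candidates concrete, the following is a minimal Python sketch for univariate, binary splitting at a single node. It is not the authors' implementation: the abstract does not specify how the structured collections of splits are sampled, so this sketch assumes each candidate is drawn by picking a feature uniformly at random and then a random midpoint between two adjacent observed values. The function names (`gini`, `sample_candidate_splits`, `best_split`) and the use of the Gini impurity as the split criterion are likewise illustrative assumptions; only the name `nsplits` comes from the text.

```python
import random


def gini(labels):
    """Gini impurity of a list of binary (0/1) labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2.0 * p * (1.0 - p)


def sample_candidate_splits(X, nsplits, rng):
    """Draw up to `nsplits` random (feature index, threshold) pairs.

    Assumption: a feature is chosen uniformly at random, then a midpoint
    between two adjacent distinct observed values is chosen at random.
    """
    n_features = len(X[0])
    candidates = []
    for _ in range(nsplits):
        j = rng.randrange(n_features)
        values = sorted(set(row[j] for row in X))
        if len(values) < 2:
            continue  # a constant feature offers no split
        k = rng.randrange(len(values) - 1)
        candidates.append((j, (values[k] + values[k + 1]) / 2.0))
    return candidates


def best_split(X, y, nsplits, rng):
    """Return the sampled candidate with the lowest weighted Gini impurity."""
    best, best_impurity = None, float("inf")
    for j, t in sample_candidate_splits(X, nsplits, rng):
        left = [yi for row, yi in zip(X, y) if row[j] <= t]
        right = [yi for row, yi in zip(X, y) if row[j] > t]
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
        if weighted < best_impurity:
            best, best_impurity = (j, t), weighted
    return best, best_impurity
```

Because only the `nsplits` sampled candidates are evaluated, the cost per node in this sketch depends on `nsplits` rather than on the total number of possible splits, which is the property that would let the same scheme extend to larger, more complex families of splits.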
