Abstract

Random Forests (RFs) are powerful ensemble learning algorithms that are widely used in various machine learning tasks. However, they tend to overfit to noisy or irrelevant features, which can decrease generalization performance. Post-hoc regularization techniques aim to solve this problem by modifying the structure of the learned ensemble after training. We propose a novel post-hoc regularization via tree smoothing for classification tasks, which leverages the reliable class distributions closer to the root node whilst reducing the impact of more specific and potentially noisy splits deeper in the tree. Our approach allows for a form of pruning that does not alter the general structure of the trees; instead, it adjusts the influence of each node based on its proximity to the root node. We evaluated the performance of our method on various machine learning benchmark data sets and on cancer data from The Cancer Genome Atlas (TCGA). Our approach performs competitively with the state of the art and, in most cases, outperforms it in terms of prediction accuracy, generalization, and interpretability.
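
To make the idea of depth-dependent node influence concrete, the following is a minimal sketch of one way a prediction could be smoothed along a root-to-leaf path. It is not the formula used in the paper, which the abstract does not state: the function name `smooth_path`, the `decay` parameter, and the geometric weighting `decay ** depth` are all assumptions chosen for illustration.

```python
from typing import List, Sequence


def smooth_path(
    path_distributions: Sequence[Sequence[float]], decay: float = 0.5
) -> List[float]:
    """Blend the class distributions along one root-to-leaf path.

    path_distributions[d] is the class distribution of the node at depth d
    (index 0 is the root, the last entry is the leaf). The node at depth d
    receives weight decay ** d, so nodes closer to the root count more.

    ASSUMPTION: this geometric weighting is a hypothetical stand-in, not the
    smoothing rule proposed in the paper.
    """
    if not path_distributions:
        raise ValueError("path must contain at least the root node")
    if not 0.0 < decay <= 1.0:
        raise ValueError("decay must lie in (0, 1]")

    n_classes = len(path_distributions[0])
    weights = [decay ** depth for depth in range(len(path_distributions))]
    total = sum(weights)

    smoothed = [0.0] * n_classes
    for weight, dist in zip(weights, path_distributions):
        if len(dist) != n_classes:
            raise ValueError("all nodes must cover the same classes")
        for k, p in enumerate(dist):
            # Normalise by the weight sum so the result is still a distribution.
            smoothed[k] += weight * p / total
    return smoothed


# Root is balanced, the leaf is pure: the smoothed prediction is less extreme
# than the leaf alone, and the tree structure itself is left untouched.
path = [[0.5, 0.5], [0.25, 0.75], [0.0, 1.0]]
smoothed = smooth_path(path, decay=0.5)
```

In this sketch a smaller `decay` pulls the prediction further towards the root distribution, while `decay = 1.0` weights every node on the path equally. In a forest, the smoothed distributions of the individual trees would then be averaged as usual.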
