Abstract

Ensemble pruning that combines a subset of individual learners generated in parallel to make predictions is an important topic in ensemble learning. Past decades have developed a lot of pruning algorithms that focus on the external behavior of learners on samples, which may lead to over-fitting. In this paper, we conjecture that the generalization performance of an ensemble is not only related to its external behavior on samples but also dependent on the internal structure of individual learners. We propose the general MEPSI approach based on Kolmogorov complexity and the Minimum Description Length (MDL) principle, which formulates the ensemble pruning task as the two-objective optimization problem that comprises the empirical error and structural information among individual learners. We also provide a concrete implementation of MEPSI on decision trees. The theoretical results provide generalization bounds for both the general MEPSI approach and tree-based implementation. The comparative experiments conducted on multiple real-world data sets demonstrate the effectiveness of our proposed method.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.