Intelligent ensembling of auto-ML system outputs for solving classification problems

Juan Pablo Consuegra-Ayala, Yoan Gutiérrez, Yudivian Almeida-Cruz, Manuel Palomar

doi:10.1016/j.ins.2022.07.061

https://doi.org/10.1016/j.ins.2022.07.061

Abstract

Automatic Machine Learning (Auto-ML) tools enable the automatic solution of real-world problems through machine learning techniques. Because these tools tend to be more time-consuming than standard machine learning libraries, making full use of all available resources is a valuable feature. This paper presents a two-phase optimization system for solving classification problems. The system is designed to produce more robust classifiers by exploiting the different architectures that are generated while solving classification problems with Auto-ML tools, particularly AutoGOAL. In the first phase, the system follows a probabilistic strategy to find the best combination of algorithms and hyperparameters, generating a collection of base models according to certain diversity criteria; in the second phase, it follows similar Auto-ML strategies to ensemble those models. The HAHA 2019 challenge corpus and the Adult dataset were used to evaluate the system. The experimental results show that: (i) a better solution can be built by ensembling a subset of the already tested models; (ii) the performance of ensemble methods depends on the collection of base models used; and (iii) ensuring diversity using the double-fault measure produces better results than using the disagreement measure. The source code is available online for the research community.
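
The two diversity measures named in the abstract can be illustrated with a minimal sketch. It assumes the standard pairwise definitions from the ensemble literature: disagreement is the fraction of samples on which exactly one of two classifiers is correct, and double-fault is the fraction on which both are wrong. The function names and the toy labels are hypothetical, and the paper's own implementation inside AutoGOAL may differ.

```python
from typing import List, Sequence


def correctness(y_true: Sequence[int], y_pred: Sequence[int]) -> List[bool]:
    """Per-sample flags: True where the prediction matches the gold label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return [t == p for t, p in zip(y_true, y_pred)]


def disagreement(y_true: Sequence[int], pred_a: Sequence[int], pred_b: Sequence[int]) -> float:
    """Fraction of samples where exactly one of the two classifiers is correct."""
    a = correctness(y_true, pred_a)
    b = correctness(y_true, pred_b)
    return sum(x != y for x, y in zip(a, b)) / len(y_true)


def double_fault(y_true: Sequence[int], pred_a: Sequence[int], pred_b: Sequence[int]) -> float:
    """Fraction of samples that both classifiers misclassify."""
    a = correctness(y_true, pred_a)
    b = correctness(y_true, pred_b)
    return sum((not x) and (not y) for x, y in zip(a, b)) / len(y_true)


# Toy data (made up for illustration only).
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
pred_a = [1, 0, 0, 1, 0, 0, 0, 1]
pred_b = [1, 1, 0, 1, 0, 1, 0, 0]

print(disagreement(y_true, pred_a, pred_b))  # 0.375 (3 of 8 samples)
print(double_fault(y_true, pred_a, pred_b))  # 0.125 (1 of 8 samples)
```

Under these definitions, a higher disagreement value indicates a more diverse pair, whereas a lower double-fault value is preferable because it means the two models rarely fail on the same sample.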

Full Text