Abstract

Hyperparameter tuning and model selection are important steps in machine learning. Unfortunately, classical hyperparameter calibration and model selection procedures are sensitive to outliers and heavy-tailed data. In this work, we construct a selection procedure based on a median-of-means principle that can be seen as a robust alternative to cross-validation. Using this procedure, we also build an ensemble method which, given a collection of algorithms and corrupted heavy-tailed data, selects an algorithm, trains it on a large uncorrupted subsample and automatically tunes its hyperparameters. In particular, the approach can turn any procedure into one that is robust to outliers and heavy-tailed data while automatically tuning its hyperparameters. The construction relies on a divide-and-conquer methodology, making the method easily scalable even on a corrupted dataset. The method is tested on the LASSO, which is known to be highly sensitive to outliers.
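
As a rough illustration of the selection step, here is a minimal Python sketch of a median-of-means comparison between candidate estimators. The helper names (`mom`, `mom_select`), the squared loss, the number of blocks and the use of scikit-learn's `Lasso` as candidates are illustrative assumptions, not the paper's exact minmax-MOM procedure.

```python
import numpy as np
from sklearn.linear_model import Lasso

def mom(values, n_blocks, rng):
    """Median-of-means: split the values into blocks and take the median of the block means."""
    blocks = np.array_split(rng.permutation(values), n_blocks)
    return np.median([b.mean() for b in blocks])

def mom_select(fitted, X_val, y_val, n_blocks=11, seed=0):
    """Minmax-MOM-style selection sketch: score each candidate f by
    max over g of MOM(loss_f - loss_g) and pick the smallest score.
    Outliers can only corrupt a few blocks, so the median ignores them."""
    rng = np.random.default_rng(seed)
    losses = [(f.predict(X_val) - y_val) ** 2 for f in fitted]   # pointwise squared losses
    scores = [max(mom(lf - lg, n_blocks, rng) for lg in losses) for lf in losses]
    return int(np.argmin(scores))

# Illustrative use: tune the LASSO regularization parameter when part of the data is corrupted.
rng = np.random.default_rng(0)
X = rng.normal(size=(600, 20))
y = X[:, 0] - 2 * X[:, 1] + 0.1 * rng.normal(size=600)
y[400:410] += 100.0                               # a handful of corrupted responses
X_tr, y_tr, X_val, y_val = X[:400], y[:400], X[400:], y[400:]
alphas = [0.01, 0.1, 1.0]
fitted = [Lasso(alpha=a).fit(X_tr, y_tr) for a in alphas]
print("selected alpha:", alphas[mom_select(fitted, X_val, y_val)])
```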

Highlights

  • Robustness has become an important subject of interest in the machine learning community over the last few years because large datasets are very likely to be corrupted

  • Robust alternatives to empirical risk minimizers and their penalized/regularized versions have been studied in density estimation [5] and least-squares regression [4, 36, 20, 50, 55]

  • To compute the minmax-MOM selection procedure in the context of the ensemble method defined in Section 4.1, the empirical risk of each estimator f_m has to be computed on the 2K_0-partition only, which, thanks to (4.4), means the computation of at most 8V|M|/3 empirical risks, as advertised (a counting sketch follows this list)
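
To make the count in the last highlight concrete, here is a small sketch of the blockwise risk computation it refers to. The function name and the squared loss are illustrative assumptions, and the total of 2K_0|M| block risks is at most 8V|M|/3 exactly when the partition has at most 8V/3 blocks, consistently with the stated bound.

```python
import numpy as np

def blockwise_risks(estimators, X, y, K0):
    """Empirical squared-loss risk of each estimator on each block of a
    2*K0-block partition of the data: 2*K0*|M| block risks in total,
    which is at most 8*V*|M|/3 whenever 2*K0 <= 8*V/3."""
    blocks = np.array_split(np.arange(len(y)), 2 * K0)
    return np.array([[np.mean((f.predict(X[idx]) - y[idx]) ** 2) for idx in blocks]
                     for f in estimators])
```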


Summary

Introduction

Robustness has become an important subject of interest in the machine learning community over the last few years because large datasets are very likely to be corrupted. Even if some candidate estimators are robust, outliers in the test set may mislead the selection/aggregation step, resulting in a poor final estimator. This raises the question of a robust selection/aggregation procedure, which is addressed in the present work. Theoretical guarantees for the latter are given in Theorem 3.2. The proofs are deferred to Appendices A and B.
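
The following toy example, with made-up numbers, illustrates the point: a single corrupted point in the hold-out set can flip a mean-based comparison of two candidates, while a median-of-means comparison is unaffected. The block count of 11 is an arbitrary choice.

```python
import numpy as np

rng = np.random.default_rng(0)
# Pointwise hold-out losses of two candidates: B is genuinely better than A,
# but one corrupted test point inflates B's average loss.
loss_a = rng.normal(1.0, 0.1, size=200)
loss_b = rng.normal(0.8, 0.1, size=200)
loss_b[0] = 1e4                                   # a single test-set outlier

print(loss_a.mean() < loss_b.mean())              # True: the mean now prefers the worse candidate A
block_means = [b.mean() for b in np.array_split(rng.permutation(loss_b - loss_a), 11)]
print(np.median(block_means) < 0)                 # True: median-of-means still prefers candidate B
```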

Setting
Minmax-MOM selection: a robust alternative to cross-validation
Definition of the method
Theoretical guarantees
An efficient partition scheme of the dataset
Application to fine-tuning the regularization parameter of the LASSO
Application to ERM and linear aggregation
Presentation
On the choices of V and Kmax
Results and discussion