Abstract

Finding optimal hyperparameter values is crucial for many machine learning algorithms. Methods such as deep learning and XGBoost require choosing among a large set of hyperparameters, which increases the computational cost of a successful hyperparameter search. The search is further complicated by performance variability across initial random seeds, caused, for example, by different initial weight assignments or by different environment interactions during training. In this paper, an approach based on extreme value theory (EVT) is proposed for hyperparameter optimization in this noisy setting. To the best of our knowledge, this is the first EVT-based work on hyperparameter optimization. Extensive experiments on many real-world datasets (including a medical diabetes dataset consisting of several medical predictor variables) demonstrate that, using only a fraction of the observations, near-optimal hyperparameter values can be obtained from the parameters of a generalized extreme value distribution fitted to the test-set performance metric of interest.
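The abstract does not spell out the method, but the core idea it names (fitting a generalized extreme value distribution to observed performance scores and using its parameters to estimate the best attainable value) can be sketched as follows. This is only an illustration of that general idea using SciPy; the function name, the synthetic data, and all numeric values are assumptions, not the paper's actual procedure.

```python
import numpy as np
from scipy.stats import genextreme

def estimate_attainable_optimum(scores):
    """Fit a GEV distribution to a sample of performance scores and
    return the estimated upper endpoint of its support, i.e. an
    estimate of the best score attainable."""
    c, loc, scale = genextreme.fit(scores)
    # In SciPy's parametrization, shape c > 0 is the Weibull-type GEV,
    # whose support is bounded above at loc + scale / c.
    if c > 0:
        return loc + scale / c
    return float("inf")  # Gumbel/Frechet type: no finite upper bound

# Synthetic demo: draw "scores" from a GEV whose true upper endpoint
# is 0.8 + 0.05 / 0.2 = 1.05 (purely illustrative numbers).
scores = genextreme.rvs(0.2, loc=0.8, scale=0.05, size=500, random_state=0)
best_estimate = estimate_attainable_optimum(scores)
```

In a real hyperparameter search one would feed in the test-set metric values observed across hyperparameter samples and random seeds; the fitted endpoint then indicates how close the search already is to the achievable optimum.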
