Abstract

Tuning the hyperparameters of machine learning models is critical to their performance. Bayesian optimization has recently emerged as the de facto method for this task. Hyperparameter tuning is usually guided by model performance on a validation set, and Bayesian optimization is used to find the hyperparameter setting that maximizes this performance. However, when the training or validation set contains only a limited number of data points, the function representing model performance on the validation set often exhibits spurious sharp peaks. In such cases, Bayesian optimization tends to converge to these sharp peaks rather than to broader, more stable ones, and a model trained with the resulting hyperparameters can suffer a dramatic drop in performance when deployed in the real world. We address this problem through a novel stable Bayesian optimization framework. We construct a new acquisition function that steers Bayesian optimization away from sharp peaks, and we provide a theoretical analysis guaranteeing that Bayesian optimization with the proposed acquisition function prefers stable peaks over unstable ones. Experiments on synthetic function optimization and hyperparameter tuning for Support Vector Machines demonstrate the effectiveness of the proposed framework.
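To make the setting concrete, below is a minimal sketch of Bayesian optimization of an SVM regularization parameter with a "stability-aware" acquisition. The abstract does not specify the paper's acquisition function, so the neighbourhood-averaged expected improvement used here (function names, radius, and loop settings are all illustrative assumptions) is only a stand-in for the general idea of penalizing narrow, spurious peaks.

```python
# Illustrative sketch (not the paper's method): Bayesian optimization of an SVM's
# C hyperparameter where the acquisition averages expected improvement over a
# small neighbourhood, so broad stable peaks score higher than narrow ones.
import numpy as np
from scipy.stats import norm
from sklearn.datasets import make_classification
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, random_state=0)

def objective(log_c):
    # Validation performance of the SVM as a function of log10(C).
    return cross_val_score(SVC(C=10.0 ** log_c), X, y, cv=3).mean()

def expected_improvement(gp, candidates, best):
    mu, sigma = gp.predict(candidates, return_std=True)
    sigma = np.maximum(sigma, 1e-9)
    z = (mu - best) / sigma
    return (mu - best) * norm.cdf(z) + sigma * norm.pdf(z)

def stable_ei(gp, candidates, best, radius=0.2, n_perturb=16, rng=None):
    # Average EI over random perturbations of each candidate; sharp peaks that
    # vanish under small shifts of the hyperparameter are penalized.
    rng = rng if rng is not None else np.random.default_rng(0)
    scores = []
    for c in candidates:
        nearby = c + rng.uniform(-radius, radius, size=(n_perturb, 1))
        scores.append(expected_improvement(gp, nearby, best).mean())
    return np.array(scores)

# Initial design and BO loop over log10(C) in [-3, 3].
rng = np.random.default_rng(0)
obs_x = rng.uniform(-3, 3, size=(5, 1))
obs_y = np.array([objective(x[0]) for x in obs_x])
gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)

for _ in range(10):
    gp.fit(obs_x, obs_y)
    grid = np.linspace(-3, 3, 200).reshape(-1, 1)
    next_x = grid[np.argmax(stable_ei(gp, grid, obs_y.max(), rng=rng))]
    obs_x = np.vstack([obs_x, next_x])
    obs_y = np.append(obs_y, objective(next_x[0]))

print("best log10(C):", obs_x[np.argmax(obs_y)][0], "accuracy:", obs_y.max())
```

Swapping `stable_ei` for plain `expected_improvement` in the loop recovers standard Bayesian optimization, which is the baseline behaviour the abstract argues can latch onto spurious sharp peaks.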
