Abstract

Selecting a final machine learning (ML) model typically occurs after a process of hyperparameter optimization in which many candidate models with varying structural properties and algorithmic settings are evaluated and compared. Evaluating each candidate model commonly relies on k-fold cross validation, wherein the data are randomly subdivided into k folds, with each fold being iteratively used as a validation set for a model that has been trained using the remaining folds. While many research studies have sought to accelerate ML model selection by applying metaheuristic and other search methods to the hyperparameter space, no consideration has been given to the k-fold cross validation process itself as a means of rapidly identifying the best-performing model. The current study rectifies this oversight by introducing a greedy k-fold cross validation method and demonstrating that greedy k-fold cross validation can vastly reduce the average time required to identify the best-performing model when given a fixed computational budget and a set of candidate models. This improved search time is shown to hold across a variety of ML algorithms and real-world datasets. For scenarios without a computational budget, this paper also introduces an early stopping algorithm based on the greedy cross validation method. The greedy early stopping method is shown to outperform a competing, state-of-the-art early stopping method both in terms of search time and the quality of the ML models selected by the algorithm. Since hyperparameter optimization is among the most time-consuming, computationally intensive, and monetarily expensive tasks in the broader process of developing ML-based solutions, the ability to rapidly identify optimal machine learning models using greedy cross validation has obvious and substantial benefits to organizations and researchers alike.
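To make the idea concrete, the sketch below shows one plausible, reader-constructed form of greedy k-fold cross validation under a fixed fold-evaluation budget. It is not the authors' published algorithm: the greedy allocation rule (spend each remaining fold evaluation on the candidate whose running mean validation score is currently best), the `greedy_kfold_search` function, and the scikit-learn example models are assumptions made purely for illustration.

```python
# Minimal illustrative sketch of greedy k-fold cross validation under a fixed
# fold-evaluation budget. NOT the authors' published algorithm: the greedy rule
# (spend the next fold evaluation on the candidate with the best running mean
# validation score) is an assumption inferred from the abstract, and all names
# are hypothetical.
import numpy as np
from sklearn.base import clone
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold


def greedy_kfold_search(candidates, X, y, k=10, fold_budget=20, seed=0):
    """Return the name of the candidate with the best mean validation score
    after spending at most `fold_budget` single-fold evaluations."""
    folds = list(KFold(n_splits=k, shuffle=True, random_state=seed).split(X))
    scores = {name: [] for name in candidates}    # validation scores per candidate
    next_fold = {name: 0 for name in candidates}  # next unevaluated fold per candidate

    def priority(name):
        # Candidates with no evaluated folds come first so every model gets an
        # initial estimate; after that, prefer the highest running mean score.
        return (1, -np.mean(scores[name])) if scores[name] else (0, 0.0)

    for _ in range(fold_budget):
        open_names = [n for n in candidates if next_fold[n] < k]
        if not open_names:
            break  # every candidate has been fully cross validated
        name = min(open_names, key=priority)
        train_idx, val_idx = folds[next_fold[name]]
        model = clone(candidates[name]).fit(X[train_idx], y[train_idx])
        scores[name].append(model.score(X[val_idx], y[val_idx]))
        next_fold[name] += 1

    return max(scores, key=lambda n: np.mean(scores[n]) if scores[n] else -np.inf)


if __name__ == "__main__":
    X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
    candidates = {f"C={c}": LogisticRegression(C=c, max_iter=1000)
                  for c in (0.01, 0.1, 1.0, 10.0)}
    print("Selected:", greedy_kfold_search(candidates, X, y))
```

Under this reading, standard k-fold cross validation corresponds to a budget large enough to evaluate every candidate on all k folds, whereas a tighter budget forces the search to concentrate fold evaluations on the currently most promising models.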

Highlights

  • Organizational development and adoption of artificial intelligence (AI) and machine learning (ML) technologies have exploded in popularity in recent years, with the total business value and total global spending on these technologies expected to reach USD 3.9 trillion and USD 77.6 billion by 2022, respectively [1,2]

  • These results provide strong statistical evidence for the superiority of the greedy k-fold method over the standard k-fold method in identifying optimal or near-optimal ML models when operating under the constraints of a computational budget

  • This paper developed and presented two variants of a greedy k-fold cross validation algorithm and subsequently evaluated their performance in a wide array of hyperparameter optimization and ML model selection tasks

Introduction

Organizational development and adoption of artificial intelligence (AI) and machine learning (ML) technologies have exploded in popularity in recent years, with the total business value and total global spending on these technologies expected to reach USD 3.9 trillion and USD 77.6 billion by 2022, respectively [1,2]. Despite the widespread availability of cloud-based computational resources, both the execution time required to train today's complex, state-of-the-art ML models and the cloud computing costs associated with training those models remain major obstacles in many real-world scientific, governmental, and commercial use cases [4]. This problem is often made exponentially worse by the need to perform hyperparameter optimization, wherein a large number of candidate ML models with varying hyperparameter settings are trained and evaluated in an effort to find the best-performing model [5,6]. The scope of the model search problem may perhaps be best understood by considering a well-known case from the ML literature.
