QLABGrad: A Hyperparameter-Free and Convergence-Guaranteed Scheme for Deep Learning

Minghan Fu, Fang-Xiang Wu

doi:10.1609/aaai.v38i11.29095

Abstract

The learning rate is a critical hyperparameter for deep learning tasks because it determines how far the model parameters are adjusted at each step of training. However, the choice of learning rate typically depends on empirical judgment, which may not yield satisfactory outcomes without intensive trial-and-error experiments. In this study, we propose a novel learning rate adaptation scheme called QLABGrad. Without any user-specified hyperparameter, QLABGrad automatically determines the learning rate by optimizing the quadratic loss approximation-based (QLAB) function for a given gradient descent direction, where only one extra forward propagation is required. We theoretically prove the convergence of QLABGrad under the smooth Lipschitz condition on the loss function. Experimental results on multiple architectures, including MLP, CNN, and ResNet, on the MNIST, CIFAR10, and ImageNet datasets, demonstrate that QLABGrad outperforms widely adopted schemes for deep learning.
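The idea of choosing a step size from a quadratic approximation of the loss along the descent direction can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the exact QLAB formula, so the code below assumes a standard quadratic interpolation that uses the current loss, the squared gradient norm, and one extra loss evaluation at a trial point. The names `qlab_step_size`, `descend`, and `trial_lr`, the starting trial step, and the fallback for a non-convex fit are all hypothetical choices made for illustration.

```python
def qlab_step_size(loss_now, loss_trial, grad_sq_norm, trial_lr):
    """Step size minimizing a quadratic fitted along the negative gradient.

    Assumed model (not taken from the paper):
        q(a) = loss_now - a * grad_sq_norm + c * a**2
    where c is chosen so that q(trial_lr) == loss_trial.
    """
    curvature = (loss_trial - loss_now + trial_lr * grad_sq_norm) / trial_lr ** 2
    if curvature <= 0.0:
        # The fitted quadratic has no minimum; keep the trial step (assumption).
        return trial_lr
    return grad_sq_norm / (2.0 * curvature)


def descend(loss_fn, grad_fn, theta, trial_lr=0.1, steps=20):
    """Gradient descent where each step size comes from qlab_step_size."""
    for _ in range(steps):
        grad = grad_fn(theta)
        grad_sq_norm = sum(g * g for g in grad)
        if grad_sq_norm == 0.0:
            break  # stationary point reached
        loss_now = loss_fn(theta)
        # One extra forward evaluation at the trial point.
        trial_point = [t - trial_lr * g for t, g in zip(theta, grad)]
        loss_trial = loss_fn(trial_point)
        lr = qlab_step_size(loss_now, loss_trial, grad_sq_norm, trial_lr)
        theta = [t - lr * g for t, g in zip(theta, grad)]
        trial_lr = lr  # reuse the last step as the next trial step (assumption)
    return theta


# Toy usage on an ill-conditioned quadratic bowl.
def bowl(theta):
    return 0.5 * (theta[0] ** 2 + 10.0 * theta[1] ** 2)


def bowl_grad(theta):
    return [theta[0], 10.0 * theta[1]]


final_theta = descend(bowl, bowl_grad, [1.0, 1.0])
```

On a loss that is exactly quadratic, the fitted model matches the true loss along the search direction, so each step is an exact line minimization regardless of the trial step. On a real network loss the fit is only approximate, which is the setting the paper's convergence analysis addresses.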

Journal: Proceedings of the AAAI Conference on Artificial Intelligence
Publication Date: Mar 24, 2024
Citations: 2

Similar Papers

Aperture Shape Generation Based on Gradient Descent With Momentum
Liyuan Zhang ... Pengcheng Zhang
IEEE Access | VOL. 7
01 Jan 2019

Cyclical Learning Rates for Training Neural Networks
Leslie N Smith
01 Mar 2017

Author response: Neural learning rules for generating flexible predictions and computing the successor representation
Ching Fang ... Dmitriy Aronov
12 Oct 2022

Editor's evaluation: Neural learning rules for generating flexible predictions and computing the successor representation
Srdjan Ostojic
29 Aug 2022
