Analyzing inexact hypergradients for bilevel learning

Matthias J Ehrhardt, Lindon Roberts

doi:10.1093/imamat/hxad035

Open Access

https://doi.org/10.1093/imamat/hxad035

Abstract

Estimating hyperparameters has been a long-standing problem in machine learning. We consider the case where the task at hand is modeled as the solution to an optimization problem. Here the exact gradient with respect to the hyperparameters cannot be feasibly computed and approximate strategies are required. We introduce a unified framework for computing hypergradients that generalizes existing methods based on the implicit function theorem and automatic differentiation/backpropagation, showing that these two seemingly disparate approaches are actually tightly connected. Our framework is extremely flexible, allowing its subproblems to be solved with any suitable method, to any degree of accuracy. We derive a priori and computable a posteriori error bounds for all our methods and numerically show that our a posteriori bounds are usually more accurate. Our numerical results also show that, surprisingly, for efficient bilevel optimization, the choice of hypergradient algorithm is at least as important as the choice of lower-level solver.
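As a rough illustration of what an implicit-function-theorem hypergradient looks like, the sketch below uses a toy ridge-regression bilevel problem. It is not the framework from the paper: the problem, the function names (`lower_solution`, `upper_loss`, `ift_hypergradient`) and the parameter `n_gd_steps` are all assumptions made for this example. The lower-level problem is solved only approximately by gradient descent, and the resulting hypergradient is compared with a finite-difference estimate.

```python
import numpy as np

# Toy bilevel problem (an assumption for illustration, not taken from the paper):
#   lower level: y*(theta) = argmin_y 0.5*||A y - b||^2 + 0.5*theta*||y||^2
#   upper level: F(theta)  = 0.5*||y*(theta) - y_true||^2

def lower_solution(A, b, theta):
    """Exact lower-level minimizer, used only for the finite-difference check."""
    n = A.shape[1]
    return np.linalg.solve(A.T @ A + theta * np.eye(n), A.T @ b)

def upper_loss(A, b, theta, y_true):
    y = lower_solution(A, b, theta)
    return 0.5 * np.sum((y - y_true) ** 2)

def ift_hypergradient(A, b, theta, y_true, n_gd_steps=2000):
    """Hypergradient dF/dtheta from the implicit function theorem.

    The lower-level solution is approximated with n_gd_steps of gradient
    descent, so the returned value is inexact when n_gd_steps is small.
    """
    n = A.shape[1]
    H = A.T @ A + theta * np.eye(n)       # Hessian of the lower-level objective in y
    step = 1.0 / np.linalg.norm(H, 2)     # step size 1/L, L = largest eigenvalue of H
    y = np.zeros(n)
    for _ in range(n_gd_steps):
        y = y - step * (H @ y - A.T @ b)  # gradient step on the lower-level objective
    # Optimality gives H y* = A^T b, hence dy*/dtheta = -H^{-1} y*.
    q = np.linalg.solve(H, y - y_true)    # solve H q = grad of upper-level loss at y
    return -float(y @ q)

rng = np.random.default_rng(0)
A = rng.standard_normal((20, 5))
y_true = rng.standard_normal(5)
b = A @ y_true + 0.1 * rng.standard_normal(20)
theta = 0.5

hg = ift_hypergradient(A, b, theta, y_true)

# Central finite difference on the exact upper-level loss as a reference value.
h = 1e-6
fd = (upper_loss(A, b, theta + h, y_true) - upper_loss(A, b, theta - h, y_true)) / (2 * h)
print(hg, fd)
```

Reducing `n_gd_steps` makes the lower-level solve less accurate, which is the kind of inexactness the abstract's error bounds are concerned with. The linear system here is solved directly for simplicity; an iterative solver could be substituted.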

Journal: IMA Journal of Applied Mathematics
Publication Date: Nov 30, 2023
Citations: 1
License type: CC BY-NC-ND 4.0

Similar Papers

Entropy-Penalized Semidefinite Programming
Mikhail Krechetov ... Martin Takac
01 Aug 2019

Big Data Analytics
Tianbao Yang ... Qihang Lin
10 Aug 2015

Distributed Learning in Non-Convex Environments—Part I: Agreement at a Linear Rate
Stefan Vlaski ... Ali H Sayed
IEEE Transactions on Signal Processing | VOL. 69
01 Jan 2020

Improved Penalty Method via Doubly Stochastic Gradients for Bilevel Hyperparameter Optimization
Wanli Shi ... Bin Gu
Proceedings of the AAAI Conference on Artificial Intelligence | VOL. 35
18 May 2021
