ANODE: Unconditionally Accurate Memory-Efficient Gradients for Neural ODEs

Amir Gholaminejad, Kurt Keutzer, George Biros

doi:10.24963/ijcai.2019/103

Abstract

Residual neural networks can be viewed as the forward Euler discretization of an Ordinary Differential Equation (ODE) with a unit time step. This has recently motivated researchers to explore other discretization approaches and train ODE-based networks. However, an important challenge of neural ODEs is their prohibitive memory cost during gradient backpropagation. A method recently proposed in arXiv:1806.07366 claimed that this memory overhead can be reduced from O(LNt), where Nt is the number of time steps, down to O(L) by solving the forward ODE backwards in time, where L is the depth of the network. However, we will show that this approach may lead to several problems: (i) it may be numerically unstable for ReLU/non-ReLU activations and general convolution operators, and (ii) the proposed optimize-then-discretize approach may lead to divergent training due to inconsistent gradients for small time step sizes. We discuss the underlying problems, and to address them we propose ANODE, a neural ODE framework that avoids the numerical instability problems noted above. ANODE has a memory footprint of O(L) + O(Nt), with the same computational cost as reversing the ODE solve. Furthermore, we discuss a memory-efficient algorithm that can further reduce this footprint at the cost of additional computation. We show results on the CIFAR-10/100 datasets using ResNet and SqueezeNext neural networks.
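The reversibility concern in point (i) can be illustrated with a toy example. The sketch below is not the ANODE implementation and is not taken from the paper: it assumes a scalar linear ODE dz/dt = -lam * z (the names `lam`, `euler_forward`, and `euler_reverse` are hypothetical) and a unit time step, as in the residual-network view. It compares recovering the input by integrating backwards in time against simply reading it from a stored trajectory, which costs O(Nt) memory for the block.

```python
def euler_forward(f, z0, h, n_steps):
    """Forward Euler for dz/dt = f(z); stores every state (O(Nt) memory)."""
    traj = [z0]
    for _ in range(n_steps):
        traj.append(traj[-1] + h * f(traj[-1]))
    return traj


def euler_reverse(f, z_final, h, n_steps):
    """Attempt to recover z0 by stepping the same ODE backwards in time.

    No intermediate states are stored, so the memory cost is constant,
    but the result only matches z0 if the reverse solve is accurate.
    """
    z = z_final
    for _ in range(n_steps):
        z = z - h * f(z)
    return z


# Assumed toy problem: a contracting linear ODE (illustrative choice only).
lam = 0.9
f = lambda z: -lam * z

z0, h, n_steps = 1.0, 1.0, 5  # unit time step, as in a residual block
traj = euler_forward(f, z0, h, n_steps)

z0_from_reverse = euler_reverse(f, traj[-1], h, n_steps)
z0_from_storage = traj[0]

print("true z0:                ", z0)
print("z0 from stored states:  ", z0_from_storage)
print("z0 from reverse solve:  ", z0_from_reverse)
print("reverse-solve abs error:", abs(z0_from_reverse - z0))
```

In this toy setting each forward step scales the state by (1 - h * lam) and each reverse step by (1 + h * lam), so the round trip does not return to the starting point. Any gradient computed from the reverse-solved states would be evaluated at the wrong activations, whereas the stored trajectory reproduces the input exactly.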

Similar Papers

The advance of neural ordinary differential ordinary differential equations
Haoxuan Li
Applied and Computational Engineering | VOL. 6
14 Jun 2023

Accelerating Neural ODEs Using Model Order Reduction.
Mikko Lehtimäki ... Marja-Leena Linne
IEEE Transactions on Neural Networks and Learning Systems | VOL. 35
01 Jan 2024

LFT: Neural Ordinary Differential Equations With Learnable Final-Time.
Dong Pang ... Xinping Guan
IEEE Transactions on Neural Networks and Learning Systems | VOL. 35
01 May 2024

Improving neural ordinary differential equations via knowledge distillation
Haoyu Chu ... Shikui Wei
IET Computer Vision | VOL. 18
06 Nov 2023
