Abstract

Over the past few years, deep learning has risen to the foreground as a topic of massive interest, mainly as a result of successes obtained in solving large-scale image processing tasks. There are multiple challenging mathematical problems involved in applying deep learning: most deep learning methods require the solution of hard optimisation problems, and a good understanding of the trade-off between computational effort, amount of data and model complexity is required to successfully design a deep learning approach for a given problem. A large amount of progress made in deep learning has been based on heuristic explorations, but there is a growing effort to mathematically understand the structure in existing deep learning methods and to systematically design new deep learning methods that preserve certain types of structure. In this article, we review a number of these directions: some deep neural networks can be understood as discretisations of dynamical systems, neural networks can be designed to have desirable properties such as invertibility or group equivariance, and new algorithmic frameworks based on conformal Hamiltonian systems and Riemannian manifolds have been proposed to solve the associated optimisation problems. We conclude our review of each of these topics by discussing some open problems that we consider to be interesting directions for future research.
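
To make the first of these directions concrete, the sketch below (our own minimal NumPy illustration; the function and variable names are not from the article) reads a residual block x_{k+1} = x_k + h f(x_k, theta_k) as one forward Euler step of the ODE dx/dt = f(x, theta(t)), so that a stack of blocks corresponds to a time discretisation of a dynamical system.

    import numpy as np

    def residual_block(x, W, b, h=0.1):
        # One residual block x + h*f(x, theta): readable as a single forward
        # Euler step of the ODE  dx/dt = f(x, theta(t))  with step size h.
        f = np.tanh(W @ x + b)
        return x + h * f

    # Stacking L blocks integrates the ODE over L steps of size h.
    rng = np.random.default_rng(0)
    x = rng.standard_normal(4)
    for _ in range(10):
        W = 0.1 * rng.standard_normal((4, 4))
        b = np.zeros(4)
        x = residual_block(x, W, b)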

Highlights

  • Structure-preserving numerical schemes have their roots in geometric integration [62] and in numerical schemes built on characterisations of PDEs as metric gradient flows [5], to name just a few examples

  • It is well-known that the optimal control formulation can be phrased as a closed dynamical system by using Pontryagin’s principle and that this results in a constrained Hamiltonian boundary value problem [12]

  • Aside from generative adversarial networks (GANs) and variational autoencoders (VAEs), normalising flows are another class of machine learning models that can be used to artificially generate data
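
To illustrate the last highlight (a minimal sketch of our own, not code from the article), an affine coupling layer is invertible by construction and has a triangular Jacobian, so the log-determinant needed for exact likelihood evaluation in a normalising flow is simply the sum of the scale outputs.

    import numpy as np

    def coupling_forward(x, scale_net, shift_net):
        # Affine coupling: keep the first half x1 fixed, transform the second
        # half x2 conditioned on x1. The Jacobian is triangular.
        x1, x2 = np.split(x, 2)
        s, t = scale_net(x1), shift_net(x1)
        y2 = x2 * np.exp(s) + t
        log_det = np.sum(s)              # log|det J|, needed for the likelihood
        return np.concatenate([x1, y2]), log_det

    def coupling_inverse(y, scale_net, shift_net):
        # Exact inverse, valid for arbitrary scale/shift networks.
        y1, y2 = np.split(y, 2)
        s, t = scale_net(y1), shift_net(y1)
        return np.concatenate([y1, (y2 - t) * np.exp(-s)])

    # Toy stand-ins for the conditioner networks; invertibility does not depend on them.
    scale_net = lambda z: 0.5 * np.tanh(z)
    shift_net = lambda z: z ** 2

    x = np.array([0.3, -1.2, 0.7, 2.0])
    y, log_det = coupling_forward(x, scale_net, shift_net)
    assert np.allclose(coupling_inverse(y, scale_net, shift_net), x)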


Summary

Introduction

Structure-preserving numerical schemes have their roots in geometric integration [62] and in numerical schemes built on characterisations of PDEs as metric gradient flows [5], to name just a few examples. The overarching aim of structure-preserving numerics is to preserve certain properties of the continuous model, e.g. mass or energy conservation, in its discretisation. Structure preservation is not restricted to playing a role in the classical numerical analysis of ODEs and PDEs: through the advent of continuum interpretations of neural networks [60, 46, 47, 116], it has also become relevant to deep learning. The main objectives are to use the continuum model and structure-preserving schemes to derive stable and convergent neural networks and associated training procedures and algorithms
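
As a small self-contained example of what preserving a property of the continuous model can mean (our own toy sketch, not taken from the article): the gradient flow dx/dt = -∇V(x) dissipates the energy V along exact solutions. Explicit Euler, i.e. plain gradient descent, can violate this for large step sizes, whereas implicit Euler decreases V for every step size, mirroring the continuous dynamics.

    import numpy as np

    # Quadratic toy energy V(x) = 0.5*lam*x^2, gradient flow dx/dt = -lam*x.
    lam, h, x0 = 10.0, 0.25, 1.0
    V = lambda x: 0.5 * lam * x**2

    x_explicit = x0 - h * lam * x0      # explicit Euler = one gradient descent step
    x_implicit = x0 / (1.0 + h * lam)   # implicit Euler step (closed form for this V)

    print(V(x0), V(x_explicit), V(x_implicit))
    # Here h*lam = 2.5, so the explicit step overshoots and increases V,
    # while the implicit step decreases V for any h > 0, like the exact flow.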

Neural networks
Residual networks and differential equations
Structure-preserving ODE formulations
Dissipative models and gradient flows
Hamiltonian vector fields
Structure-preserving numerical methods for the ODE model
Numerical methods preserving dissipativity
Numerical methods preserving energy or symplecticity
Splitting methods and shears
Features evolving on Lie groups or homogeneous manifolds
Geometric properties of Hamiltonian models
Measure-preserving models
Manifold based models
Invertible neural networks and normalising flows
Coupling layers
Invertible layers through iterative schemes
Linear invertible components
Memory-efficient backpropagation
Invertible networks as subnetworks
Density estimation and generative modelling
Fast inverses without coupling
Stability guarantees
Deep learning meets optimal control
Derivative-based algorithms
Method of successive approximation
Regularisation
Deep limits
Open problems
Algorithms with built-in errors
Algorithms without gradients
Equivariant neural networks
Homogeneous spaces
Equivariant linear maps
Equivariant non-linearities
A numerical demonstration of the use of equivariant neural networks
Approximation properties
Approximate equivariance
Structure-exploiting learning
Conformal Hamiltonian systems
Learning in Riemannian metric spaces
Network parameters evolving on manifolds
Information geometry
Optimisation of two-layer ReLU neural networks as Wasserstein gradient flows
Port-Hamiltonian optimisation methods
Convergence analysis for natural gradient optimisation
Findings
Conclusion
