Estimate Of Generalization Error Research Articles

This work introduces Bellman Neural Networks (BeNNs) and employs them to learn the optimal control actions for the class of optimal control problems (OCPs) with integral quadratic cost. BeNNs are a particular family of Physics-Informed Neural Networks (PINNs) designed and trained to tackle OCPs by applying the Bellman Principle of Optimality (BPO). The BPO provides necessary and sufficient optimality conditions, which result in a nonlinear partial differential equation known as the Hamilton-Jacobi-Bellman (HJB) equation. BeNNs learn the optimal control actions from the unknown solution of the arising HJB equation (i.e., the value function), which is modeled using a neural network. Additionally, the paper shows how to estimate upper bounds on the generalization error of BeNNs while learning the solutions for the OCP class under consideration. The generalization error estimate is given in terms of the choice and number of training points as well as the training error. Numerical studies show that BeNNs can be successfully applied to learn the feedback control actions for the class of optimal control problems considered and, once training is complete, deployed to control the system in a closed-loop fashion. Impact Statement: The proposed research improves our understanding of how to solve optimal control problems with closed-loop solutions and has a potentially countless number of applications in several different areas. The study lies at the intersection of optimal control theory and artificial intelligence, connected with mathematical tools for functional interpolation. It advances the ability to implement a higher level of autonomy in decision-making for practical applications, with a beneficial impact on society.
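
For orientation, the HJB equation mentioned in the abstract can be written out for one common setting. The block below is a sketch under stated assumptions, not the paper's own formulation: it assumes control-affine dynamics, no terminal cost, a symmetric positive definite weight R, and the symbols f, g, Q, R, V, which do not appear in the abstract. Under these assumptions the minimization over the control has a closed form, so the feedback law follows directly from the gradient of the value function, which is the quantity a network would approximate.

```latex
\begin{aligned}
&\dot{x} = f(x) + g(x)\,u, \qquad
J = \int_{t_0}^{t_f} \left( x^\top Q x + u^\top R u \right) dt, \\
&-\frac{\partial V}{\partial t}
  = \min_{u} \left[ x^\top Q x + u^\top R u
  + \nabla_x V^\top \left( f(x) + g(x)\,u \right) \right],
  \qquad V(t_f, x) = 0, \\
&u^* = -\tfrac{1}{2}\, R^{-1} g(x)^\top \nabla_x V, \\
&-\frac{\partial V}{\partial t}
  = x^\top Q x + \nabla_x V^\top f(x)
  - \tfrac{1}{4}\, \nabla_x V^\top g(x)\, R^{-1} g(x)^\top \nabla_x V .
\end{aligned}
```

The third line comes from setting the derivative of the bracket with respect to u to zero; substituting it back gives the last line, a PDE in V alone.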

We introduce a Gaussian Process (GP) generalization of ResNets (with unknown functions of the network replaced by GPs and identified via MAP estimation), which includes ResNets (trained with L2 regularization on weights and biases) as a particular case (when employing particular kernels). We show that ResNets (and their warping GP regression extension) converge, in the infinite depth limit, to a generalization of image registration variational algorithms. In this generalization, images are replaced by functions mapping input/output spaces to a space of unexpressed abstractions (ideas), and material points are replaced by data points. Whereas computational anatomy aligns images via warping of the material space, this generalization aligns ideas (or abstract shapes as in Plato’s theory of forms) via the warping of the Reproducing Kernel Hilbert Space (RKHS) of functions mapping the input space to the output space. While the Hamiltonian interpretation of ResNets is not new, it was based on an Ansatz. We do not rely on this Ansatz and present the first rigorous proof of convergence of ResNets with trained weights and biases towards a flow driven by Hamiltonian dynamics. Since our proof is constructive and based on discrete and continuous mechanics, it reveals several remarkable properties of ResNets and their GP generalization. ResNet regressors are kernel regressors with data-dependent warping kernels. Minimizers of L2 regularized ResNets satisfy a discrete least action principle implying the near preservation of the norm of weights and biases across layers. The trained weights of ResNets with scaled/strong L2 regularization can be identified by solving an autonomous Hamiltonian system. The trained ResNet parameters are unique up to (a function of) the initial momentum, and the initial momentum representation of those parameters is generally sparse. The kernel (nugget) regularization strategy provides a provably robust alternative to Dropout for ANNs.
We introduce a functional generalization of GPs and show that pointwise GP/RKHS error estimates lead to probabilistic and deterministic generalization error estimates for ResNets. When performed with feature maps, the proposed analysis identifies the (EPDiff) mean-field limit of trained ResNet parameters as the number of data points goes to infinity. The search for good architectures can be reduced to that of good kernels, and we show that the composition of warping regression blocks with reduced equivariant multichannel kernels (introduced here) recovers and generalizes CNNs to arbitrary spaces and groups of transformations.
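
The infinite depth limit referred to above can be illustrated with the standard observation that a residual update x + h f(x) is one explicit Euler step of the flow dx/dt = f(x). The sketch below is only that illustration, not the paper's construction or proof: the linear vector field `A`, the step size 1/depth, and the function names are all hypothetical choices made so that the exact flow is known in closed form.

```python
import numpy as np

# Hypothetical linear vector field: a rotation generator, chosen only because
# its exact time-1 flow (rotation by 1 radian) is known in closed form.
A = np.array([[0.0, -1.0], [1.0, 0.0]])

def residual_forward(x0, depth):
    """Apply `depth` residual layers x <- x + (1/depth) * A @ x."""
    x = np.asarray(x0, dtype=float)
    h = 1.0 / depth  # step size shrinks as the network gets deeper
    for _ in range(depth):
        x = x + h * (A @ x)
    return x

def exact_flow(x0):
    """Exact solution of dx/dt = A x at t = 1 (rotation by 1 radian)."""
    c, s = np.cos(1.0), np.sin(1.0)
    return np.array([[c, -s], [s, c]]) @ np.asarray(x0, dtype=float)

x0 = [1.0, 0.0]
# Distance between the depth-d residual network output and the continuous flow.
errors = {
    d: float(np.linalg.norm(residual_forward(x0, d) - exact_flow(x0)))
    for d in (10, 100, 1000)
}
for d, e in errors.items():
    print(f"depth={d:5d}  error={e:.2e}")
```

In this toy setting the gap to the continuous flow shrinks as the depth grows, which is the sense in which a deep residual stack approximates a flow map.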

Articles published on Estimate Of Generalization Error

Advanced Ensemble Framework for Diabetes Outcome Forecasting

Space-time error estimates for approximations of linear parabolic problems with generalized time boundary conditions

Error estimates for POD-DL-ROMs: a deep learning framework for reduced order modeling of nonlinear parametrized PDEs enhanced by proper orthogonal decomposition

Bellman Neural Networks for the Class of Optimal Control Problems With Integral Quadratic Cost

Applications of the duality theory of convex analysis to the complete electrode model of electrical impedance tomography

Evaluating neural network models in site-specific solar PV forecasting using numerical weather prediction data and weather observations

Invasive or More Direct Measurements Can Provide an Objective Early-Stopping Ceiling for Training Deep Neural Networks on Non-invasive or Less-Direct Biomedical Data

On stability and regularization for data-driven solution of parabolic inverse source problems

Do ideas have shape? Idea registration as the continuous limit of artificial neural networks

Confidence intervals for the random forest generalization error

A Cluster-then-label Approach for Few-shot Learning with Application to Automatic Image Data Labeling

Neural network guided adjoint computations in dual weighted residual error estimation

A Hybrid High-Order method for incompressible flows of non-Newtonian fluids with power-like convective behaviour

A General Error Estimate For Parabolic Variational Inequalities

Assessment of Characteristics and Conditions before the End of Lockdown

Committee neural network potentials control generalization errors and enable active learning

General least product relative error estimation for multiplicative regression models with or without multiplicative distortion measurement errors

Well-Posedness and Finite Element Approximations for Elliptic SPDEs with Gaussian Noises

Kernel Stability for Model Selection in Kernel-Based Algorithms

Rescaled Boosting in Classification
