Abstract

A method to increase the precision of feedforward networks is proposed. It requires prior knowledge of the target function's derivatives of several orders and uses this information in gradient-based training. The forward pass calculates not only the values of the output layer of the network but also their derivatives; the deviations of those derivatives from the target ones enter an extended cost function, and the backward pass then calculates the gradient of the extended cost with respect to the weights, which is used by a weight-update algorithm. The most accurate approximation is obtained when training starts with all available derivatives, which are then excluded from the extended cost function step by step, starting with the highest orders, until only the values are trained. Despite a substantial increase in arithmetic operations per pattern compared with conventional training, the method yields a 140-1000 times more accurate approximation for simple cases at an equal total number of operations. This level of precision is also out of reach for the regular cost function. The method works well for solving differential equations with neural networks. There, the cost function is the deviation of the equation's residual from zero, and it can be extended by differentiating the equation itself, so no prior information is required. This extension allows a 2-D nonlinear partial differential equation to be solved 13 times more accurately using seven times fewer grid points. A GPU-efficient algorithm for calculating the gradient of the extended cost function is proposed.
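The following is a minimal JAX sketch of the extended-cost idea described above, not the paper's implementation: a small MLP approximates a scalar target whose analytic derivatives are assumed known, the loss penalizes deviations of the network's value and its first and second input-derivatives from the target's, and the gradient of this extended cost with respect to the weights is obtained by automatic differentiation. All function names, the network architecture, and the choice of target are illustrative assumptions.

```python
import jax
import jax.numpy as jnp

def init_params(key, sizes=(1, 32, 32, 1)):
    # Simple dense layers with tanh activations (illustrative architecture).
    keys = jax.random.split(key, len(sizes) - 1)
    return [(jax.random.normal(k, (m, n)) / jnp.sqrt(m), jnp.zeros(n))
            for k, m, n in zip(keys, sizes[:-1], sizes[1:])]

def mlp(params, x):
    # Scalar input -> scalar output.
    h = jnp.atleast_1d(x)
    for W, b in params[:-1]:
        h = jnp.tanh(h @ W + b)
    W, b = params[-1]
    return (h @ W + b)[0]

# Target function and its analytic derivatives (the prior knowledge the method assumes).
g   = jnp.sin
dg  = jnp.cos
d2g = lambda x: -jnp.sin(x)

def extended_cost(params, xs, orders=2):
    """Mean squared deviation of the network's value and its derivatives
    up to `orders` from the target's, summed over the included orders."""
    f   = lambda x: mlp(params, x)
    df  = jax.grad(f)        # derivative of the network output w.r.t. its input
    d2f = jax.grad(df)
    net_terms = [f, df, d2f][: orders + 1]
    tgt_terms = [g, dg, d2g][: orders + 1]
    loss = 0.0
    for nf, tf in zip(net_terms, tgt_terms):
        loss += jnp.mean(jax.vmap(lambda x: (nf(x) - tf(x)) ** 2)(xs))
    return loss

# Gradient of the extended cost w.r.t. the weights, ready for any gradient-based
# update rule. Training could start with orders=2 and later drop to orders=1 and
# orders=0, mirroring the schedule described in the abstract.
key = jax.random.PRNGKey(0)
params = init_params(key)
xs = jnp.linspace(-jnp.pi, jnp.pi, 64)
loss, grads = jax.value_and_grad(extended_cost)(params, xs)
```

The same automatic-differentiation machinery would carry over to the differential-equation setting by differentiating the residual itself instead of comparing against known target derivatives; that variant is not shown here.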
