Abstract

In this paper we present in a unified framework the gradient algorithms employed in the adaptation of linear time filters (TF) and in the supervised training of (non-linear) neural networks (NN). The optimality criteria used to optimize the parameters H of the filter or network are least squares (LS) and least mean squares (LMS) in both contexts. They respectively minimize the total or the mean squared error e(k) between an (output) reference sequence d(k) and the actual system output y(k) corresponding to the input X(k). Minimization is performed iteratively by a gradient algorithm. The index k in (TF) is time and it runs indefinitely; iterations therefore start as soon as reception of X(k) begins. The recursive algorithm for the adaptation H(k - 1) → H(k) of the parameters is applied each time a new input X(k) is observed. When training a (NN) with a finite number of examples, the index k denotes the example and it is upper-bounded. Iterative (block) algorithms wait until all K examples have been received before updating the network. However, since K is frequently very large, recursive algorithms are also often preferred in (NN) training, but they raise the question of how to order the examples X(k).

Except in the specific case of a transversal filter, there is no general recursive technique for optimizing the LS criterion. However, X(k) is normally a stationary random sequence, so LS and LMS become equivalent when k is large. Moreover, the LMS criterion can always be minimized recursively with the help of the stochastic LMS gradient algorithm, which has low computational complexity.

In (TF), X(k) is a sliding window of (time) samples, whereas in the supervised training of (NN) with arbitrarily ordered examples, X(k - 1) and X(k) have nothing to do with each other. When this (major) difference is removed by feeding a time signal to the network input, the recursive algorithms recently developed for (NN) training become similar to those of adaptive filtering. In this context the present paper displays the similarities between adaptive cascaded linear filters and trained multilayer networks. It is also shown that there is a close similarity between adaptive recursive filters and neural networks that include feedback loops.

The classical filtering approach is to evaluate the gradient by ‘forward propagation’, whereas the most popular (NN) training method uses gradient backward propagation. We show that when a linear (TF) problem is implemented by an (NN), the two approaches are equivalent. However, the backward method can be used for more general (non-linear) filtering problems. Conversely, new insights can be drawn in the (NN) context by the use of a forward gradient computation.

The advantage of the (NN) framework, and in particular of the gradient backward propagation approach, is evidently its much larger spectrum of applications than (TF), since (i) the inputs are arbitrary and (ii) the (NN) can perform non-linear (TF).
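As an illustration of the recursive adaptation H(k - 1) → H(k) and of the stochastic LMS gradient algorithm mentioned above, the sketch below implements the standard LMS update for an adaptive transversal (FIR) filter. It is a minimal example, not code from the paper; the step size mu, the filter length n_taps, and the signal names are assumptions chosen for illustration.

```python
import numpy as np

def lms_adapt(x, d, n_taps=8, mu=0.01):
    """Stochastic LMS gradient adaptation of a transversal (FIR) filter.

    x : input signal (1-D array), d : reference (desired) signal.
    At each time k the parameter vector H is updated from the
    instantaneous error e(k) = d(k) - y(k).
    """
    H = np.zeros(n_taps)               # filter parameters H(k)
    e = np.zeros(len(x))
    for k in range(n_taps, len(x)):
        X_k = x[k - n_taps:k][::-1]    # sliding window of the last n_taps samples
        y_k = H @ X_k                  # filter output y(k)
        e[k] = d[k] - y_k              # error e(k)
        H = H + mu * e[k] * X_k        # H(k) = H(k-1) + mu * e(k) * X(k)
    return H, e
```

Each update costs a number of operations proportional to the filter length, which is the low computational complexity referred to above.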

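The equivalence between the forward (filtering) and backward (backpropagation) gradient computations stated above can be checked directly in the simplest case, a single linear unit trained on a squared-error criterion; the paper treats the general multilayer and recursive cases, which this sketch does not attempt. All variable names are illustrative.

```python
import numpy as np

# For a single linear unit y = H . X with cost J = 0.5 * (d - y)**2,
# the gradient obtained by backward propagation coincides with the
# "forward" gradient -e(k) X(k) used in adaptive filtering.
rng = np.random.default_rng(0)
X = rng.normal(size=4)        # input vector X(k)
H = rng.normal(size=4)        # parameters H
d = 1.0                       # reference output d(k)

y = H @ X
e = d - y                     # error e(k)

grad_forward = -e * X         # filtering view: dJ/dH = -e(k) X(k)

delta = (y - d) * 1.0         # backprop view: dJ/dy times the (unit) slope of a linear activation
grad_backward = delta * X     # dJ/dH = delta * X

assert np.allclose(grad_forward, grad_backward)
```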