Abstract

This chapter discusses the behavior of learning automata under different reinforcement schemes. It describes a number of recurrent reinforcement schemes for solving the problem of adaptive control of static systems. The nonprojectional algorithms of Narendra and Shapiro, of Luce, and of Varshavskii and Vorontsova can be used to solve learning problems associated with binary loss functions. The algorithm of Luce has the highest convergence rate; however, a high convergence rate cannot be guaranteed for all average loss functions. The reinforcement scheme of Varshavskii and Vorontsova is a modification of the algorithm of Luce, and with this algorithm the widest range of average loss functions can be considered. The Bush–Mosteller reinforcement scheme can solve the adaptive control problem only when the average loss function of the optimal strategy is equal to zero or tends to zero. Projectional algorithms, for solving problems with continuous loss functions on the interval (−∞, ∞), are also introduced in the chapter. These algorithms are significantly more complex and require the solution of a quadratic programming problem, using the projection operator, at each step.
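To make the two families of schemes concrete, the following is a minimal sketch, not taken from the chapter itself: a linear, Bush–Mosteller-type update of the action probabilities for a binary (0/1) loss, and the Euclidean projection onto the probability simplex that projectional algorithms rely on as their quadratic-programming step. The function names, the learning-rate parameter `theta`, and the specific parameterisation are illustrative assumptions.

```python
import numpy as np

def bush_mosteller_step(p, action, loss, theta=0.1):
    """One linear (Bush–Mosteller-type) update of the action-probability
    vector p, for a binary loss: 0 = success, 1 = failure (illustrative form)."""
    p = p.copy()
    r = len(p)
    if loss == 0:
        # Reward: shift probability mass toward the chosen action.
        p = (1 - theta) * p
        p[action] += theta
    else:
        # Penalty: shift probability mass away from the chosen action.
        p = (1 - theta) * p + theta / (r - 1)
        p[action] -= theta / (r - 1)
    return p

def simplex_projection(v):
    """Euclidean projection of v onto the probability simplex
    (the kind of quadratic-programming step a projectional scheme
    performs after each unconstrained gradient-like update)."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    rho = np.nonzero(u + (1 - css) / (np.arange(len(v)) + 1) > 0)[0][-1]
    lam = (1 - css[rho]) / (rho + 1)
    return np.maximum(v + lam, 0)
```

A nonprojectional update such as `bush_mosteller_step` keeps the probabilities on the simplex by construction, which is why it is restricted to bounded (here binary) losses, whereas a projectional scheme may take an unconstrained step driven by a loss in (−∞, ∞) and then restore feasibility with `simplex_projection`.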
