Abstract

This chapter discusses the behavior of learning automata under different reinforcement schemes. It describes a number of recurrent reinforcement schemes for solving the problem of adaptive control of static systems. The nonprojectional algorithms of Narendra and Shapiro, of Luce, and of Varshavskii and Vorontsova can be used to solve learning problems associated with binary loss functions. The algorithm of Luce has the highest convergence rate; however, a high convergence rate cannot be guaranteed for all average loss functions. The reinforcement scheme of Varshavskii and Vorontsova is a modification of the algorithm of Luce, and with this algorithm the widest range of average loss functions can be considered. The Bush–Mosteller reinforcement scheme can solve the adaptive control problem only when the average loss function of the optimal strategy is equal to zero or tends to zero. Projectional algorithms, for solving problems with continuous loss functions on the interval (−∞, ∞), are also introduced in the chapter. These algorithms are significantly more complex and require the solution of a quadratic programming problem, using the projection operator, at each step.
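To make the two families of schemes concrete, the following is a minimal sketch, not taken from the chapter itself: a linear, Bush–Mosteller-type update of the action probabilities for a binary (0/1) loss, and the Euclidean projection onto the probability simplex that projectional algorithms rely on as their quadratic-programming step. The function names, the learning-rate parameter `theta`, and the specific parameterisation are illustrative assumptions.

```python
import numpy as np

def bush_mosteller_step(p, action, loss, theta=0.1):
    """One linear (Bush–Mosteller-type) update of the action-probability
    vector p, for a binary loss: 0 = success, 1 = failure (illustrative form)."""
    p = p.copy()
    r = len(p)
    if loss == 0:
        # Reward: shift probability mass toward the chosen action.
        p = (1 - theta) * p
        p[action] += theta
    else:
        # Penalty: shift probability mass away from the chosen action.
        p = (1 - theta) * p + theta / (r - 1)
        p[action] -= theta / (r - 1)
    return p

def simplex_projection(v):
    """Euclidean projection of v onto the probability simplex
    (the kind of quadratic-programming step a projectional scheme
    performs after each unconstrained gradient-like update)."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    rho = np.nonzero(u + (1 - css) / (np.arange(len(v)) + 1) > 0)[0][-1]
    lam = (1 - css[rho]) / (rho + 1)
    return np.maximum(v + lam, 0)
```

A nonprojectional update such as `bush_mosteller_step` keeps the probabilities on the simplex by construction, which is why it is restricted to bounded (here binary) losses, whereas a projectional scheme may take an unconstrained step driven by a loss in (−∞, ∞) and then restore feasibility with `simplex_projection`.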
