Abstract

A paper read by the present author at the Sixth Prague Conference dealt with the following adaptive policy for controlling a Markov chain with unknown parameters: “The trajectory $\{X_n,\ n = 0, 1, \ldots\}$ is observed, and the unknown parameters are estimated at times $n = 0, 1, 2, \ldots$ Thereafter, the optimal stationary policy is computed as if the estimates were the exact values of the unknown parameters. The value of the estimated optimal stationary policy at the state $X_n$ gives the control parameter value for the transition to $X_{n+1}$.” In the past three years this adaptive policy has been investigated further, since it represents a counterpart to the Bayesian approach to adaptive control. Results of varying degrees of completeness and generality were obtained for controlled Markov chains and processes with a finite state space, for processes of the diffusion type, and for discrete-time linear systems ([4], [3], [2], [1]). In Sections 1 and 2 of the present paper we briefly illustrate the essential lines of this research on the case of discrete-time linear systems. This makes a comparison with other approaches to adaptive control possible, since linear systems are the most thoroughly explored subject of stochastic control theory. In Sections 3, 4 and 5 we establish the law of the iterated logarithm for quadratic functionals under adaptive control policies based on least squares estimation.
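To make the quoted policy concrete, the following is a minimal sketch (not taken from the paper) of the certainty-equivalence loop it describes, specialized to a scalar discrete-time linear system with quadratic cost and least squares estimation. The scalar setting, the parameter values, and all function names are illustrative assumptions, not the paper's formulation.

```python
# Hypothetical sketch of the adaptive policy: at each step, re-estimate the
# unknown parameters (a, b) of  x_{n+1} = a*x_n + b*u_n + w_n  by least
# squares, then apply the control that would be optimal for the quadratic
# cost  sum(q*x_n^2 + r*u_n^2)  if the estimates were the true values.
import numpy as np

def lq_gain(a, b, q=1.0, r=1.0, iters=200):
    """Solve the scalar discrete-time Riccati equation by fixed-point
    iteration and return the optimal feedback gain k, i.e. u = -k*x."""
    p = q
    for _ in range(iters):
        p = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
    return a * b * p / (r + b * b * p)

rng = np.random.default_rng(0)
a_true, b_true = 0.8, 1.0          # unknown to the controller
x = 1.0
S = 1e-3 * np.eye(2)               # information matrix  sum z z^T
v = np.zeros(2)                    # accumulated        sum z x_next
theta = np.array([0.0, 1.0])       # current estimate (a_hat, b_hat)

for n in range(500):
    k = lq_gain(theta[0], theta[1])    # optimal policy for the estimate
    u = -k * x                         # certainty-equivalence control
    x_next = a_true * x + b_true * u + 0.1 * rng.standard_normal()
    z = np.array([x, u])               # least squares regressor
    S += np.outer(z, z)
    v += z * x_next
    theta = np.linalg.solve(S, v)      # updated estimate of (a, b)
    x = x_next

print("final estimates (a_hat, b_hat):", theta)
```

The point of the sketch is the ordering of the two operations in each iteration: the policy is recomputed from the current estimates before every transition, exactly as in the quoted description.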
