Abstract
The author considers the adaptive control of Markov chains under the weak accessibility condition, with the objective of minimizing the learning loss. First, it is shown that, under a stationary randomized control scheme, the maximum likelihood estimate of the unknown parameter converges exponentially fast to its true value. Then a certainty equivalence control with a forcing-type scheme is constructed, with alternating phases of forcing and certainty equivalence control. The stationary randomized control scheme is used for forcing in such a way that, by cutting and pasting the resulting observations, a single Markov chain is obtained. This in turn allows the rate of forcing to be chosen appropriately, giving a learning loss of O(f(n) log n) for any function f(n) → ∞ as n → ∞.
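
As a rough illustration of the scheme described above, the following Python sketch alternates forced steps (uniformly randomized actions) with certainty equivalence steps on a toy two-state, two-action chain. Everything specific here is an assumption made for illustration and is not taken from the paper: the transition table P_NEXT1, the two-element candidate parameter set, the greedy choice of the certainty equivalence action, the forcing budget f(n) log(n + 1) with f(n) = log(n + 2), and all function names. The sketch also simply pools the forced transitions for the likelihood; it does not reproduce the paper's cut-and-paste construction.

```python
import math
import random

# Hypothetical toy model: P_NEXT1[theta][state][action] = P(next state = 1).
P_NEXT1 = [
    [[0.9, 0.1], [0.8, 0.2]],  # candidate parameter 0
    [[0.1, 0.9], [0.2, 0.8]],  # candidate parameter 1
]


def log_likelihood(theta, transitions):
    """Log-likelihood of observed (state, action, next_state) triples under theta."""
    total = 0.0
    for s, a, s_next in transitions:
        p1 = P_NEXT1[theta][s][a]
        total += math.log(p1 if s_next == 1 else 1.0 - p1)
    return total


def ml_estimate(transitions):
    """Maximum likelihood estimate over the finite candidate set (ties -> lowest index)."""
    return max(range(len(P_NEXT1)), key=lambda th: log_likelihood(th, transitions))


def greedy_action(theta, state):
    """Certainty equivalence action: act as if theta were the true parameter.

    Assumption: reward is 1 in state 1, so we pick the action that maximizes
    the estimated probability of moving to state 1.
    """
    return max((0, 1), key=lambda a: P_NEXT1[theta][state][a])


def run(true_theta, horizon, seed=0):
    """Simulate forcing + certainty equivalence; return (estimate, forced steps, reward)."""
    rng = random.Random(seed)
    state, reward, theta_hat = 0, 0, 0
    forced = []  # transitions observed under the randomized (forcing) control
    for n in range(1, horizon + 1):
        # Forcing budget f(n) * log(n + 1), with the assumed f(n) = log(n + 2).
        budget = math.log(n + 2) * math.log(n + 1)
        is_forced = len(forced) < budget
        if is_forced:
            action = rng.randrange(2)  # stationary randomized control
        else:
            action = greedy_action(theta_hat, state)
        p1 = P_NEXT1[true_theta][state][action]
        next_state = 1 if rng.random() < p1 else 0
        if is_forced:
            forced.append((state, action, next_state))
            theta_hat = ml_estimate(forced)  # update only from forced data
        reward += next_state
        state = next_state
    return theta_hat, len(forced), reward


if __name__ == "__main__":
    estimate, n_forced, total_reward = run(true_theta=1, horizon=5000)
    print(estimate, n_forced, total_reward)
```

In this toy setting the number of forced steps grows only like log(n + 2) log(n + 1), so almost all steps use the certainty equivalence action; a slower-growing f(n) would force even less often, at the cost of slower identification of the parameter.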