Abstract

Motivated by the problem of the asymptotic behavior of a class of actor-critic learning algorithms recently proposed by Konda and Tsitsiklis, the almost sure asymptotic properties of two time-scale stochastic approximation are analyzed under violated Kushner-Clark noise conditions and very general stability conditions. The analysis covers algorithms with additive noise as well as those with non-additive noise. The algorithms with additive noise are analyzed for the case where the noise is state-dependent. The analysis of the algorithms with non-additive state-dependent noise is carried out for the case where the noise is a Markov chain controlled by the algorithm states, while the algorithms with non-additive exogenous noise are analyzed for the case where the noise is correlated and satisfies uniform or strong mixing conditions. The obtained results characterize the robustness of two time-scale stochastic approximation to violations of the Kushner-Clark noise condition. Moreover, they cover a fairly broad class of highly non-linear two time-scale stochastic approximation algorithms, including the actor-critic learning algorithms proposed by Konda and Tsitsiklis.
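To make the setting concrete, the following is a minimal illustrative sketch (not the authors' algorithm) of a two time-scale stochastic approximation scheme: a fast iterate driven by a larger step size tracks a noisy quantity, while a slow iterate with a smaller step size follows it, so that the step-size ratio vanishes asymptotically. The target value `c` and both update maps are hypothetical choices made purely for illustration.

```python
import random

random.seed(0)
c = 3.0          # unknown quantity, observed only through additive noise (hypothetical)
theta, w = 0.0, 0.0
for n in range(1, 200001):
    a = 1.0 / n            # slow step size
    b = 1.0 / n ** 0.6     # fast step size; a/b -> 0, the two time-scale condition
    y = c + random.gauss(0.0, 1.0)   # noisy observation (additive noise)
    w += b * (y - w)                 # fast iterate: tracks the mean c
    theta += a * (w - theta)         # slow iterate: sees w as quasi-static
```

On the fast time scale `w` averages out the noise and converges to `c`; on the slow time scale `theta` then tracks the (effectively converged) fast iterate, which is the decoupling that the almost sure analysis in the paper makes rigorous under much weaker noise conditions.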
