Abstract

This final chapter presents four case studies of stochastic approximation algorithms in state/parameter estimation and modeling in the context of POMDPs. Example 1 discusses online estimation of the parameters of an HMM using the recursive maximum likelihood estimation algorithm. The motivation stems from classical adaptive control: the parameter estimation algorithm can be used to estimate the parameters of the POMDP for a fixed policy; the policy can then be updated via dynamic programming (or an approximation thereof) based on the estimated parameters, and so on. Example 2 shows that for an HMM comprising a slow Markov chain, the least mean squares algorithm can provide satisfactory state estimates of the Markov chain without any knowledge of the underlying parameters. In the context of POMDPs, once the state estimates are available, a variety of suboptimal algorithms can be used to synthesize a reasonable policy. Example 3 shows how discrete stochastic optimization problems can be solved via stochastic approximation algorithms. In controlled sensing, such algorithms can be used to compute the optimal sensing strategy from a finite set of policies. Example 4 shows how large-scale Markov chains can be approximated by a system of ordinary differential equations. This mean field analysis is illustrated in the context of information diffusion in a social network. As a result, a tractable model can be obtained for state estimation via Bayesian filtering. We also show how consensus stochastic approximation algorithms can be analyzed using standard stochastic approximation methods.

A primer on stochastic approximation algorithms

This section presents a rapid summary of the convergence analysis of stochastic approximation algorithms. Analyzing the convergence of stochastic approximation algorithms is a highly technical area. The books [48, 305, 200] are seminal works that study the convergence of stochastic approximation algorithms under general conditions.
Our objective here is much more modest. We merely wish to point out the final outcome of the analysis and then illustrate how this analysis can be applied to the four case studies relating to POMDPs. Consider a constant step-size stochastic approximation algorithm of the form

θ_{k+1} = θ_k + ε H(θ_k, x_k), k = 0, 1, …

where {θ_k} is the sequence of parameter estimates generated by the algorithm, ε is a small fixed positive step size, and {x_k} is a discrete-time geometrically ergodic Markov process (continuous or discrete state) with transition kernel P(θ_k) and stationary distribution π_{θ_k}.
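A minimal sketch of this recursion, under hypothetical choices not taken from the text: the update function is H(θ, x) = x − θ, and the driving process {x_k} is a two-state geometrically ergodic Markov chain with an illustrative transition matrix P (independent of θ, for simplicity). The associated ODE is dθ/dt = E_π[x] − θ, so the iterates hover near the chain's stationary mean; averaging the iterates over the second half of the run reduces the O(√ε) fluctuations.

```python
import random

def simulate_sa(num_steps=200_000, eps=0.01, seed=0):
    """Constant step-size stochastic approximation
    theta_{k+1} = theta_k + eps * H(theta_k, x_k), with H(theta, x) = x - theta,
    driven by a 2-state Markov chain. All numerical values are illustrative."""
    rng = random.Random(seed)
    # Hypothetical transition matrix; stationary distribution is (2/3, 1/3),
    # so the stationary mean of x is 1/3.
    P = [[0.9, 0.1], [0.2, 0.8]]
    x, theta = 0, 0.0
    avg, count = 0.0, 0
    for k in range(num_steps):
        theta += eps * (x - theta)          # SA update
        if k >= num_steps // 2:             # average iterates over the second half
            avg += theta
            count += 1
        # Sample the next state of the Markov chain
        x = 0 if rng.random() < P[x][0] else 1
    return avg / count
```

For this chain, the averaged iterate settles near the stationary mean 1/3, consistent with the ODE analysis; with θ-dependent kernels P(θ_k), the same recursion applies but the chain's transition probabilities would be updated alongside θ.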
