Abstract

It is well known that when estimating performance measures associated with a stochastic system, a good importance sampling (IS) distribution can give orders of magnitude of variance reduction, while a bad one may lead to large, even infinite, variance. In this paper we study how this sensitivity of the estimator variance to the importance sampling change of measure may be "dampened" by combining importance sampling with a stochastic approximation based temporal difference (TD) method. We consider a finite state space discrete time Markov chain (DTMC) with one-step transition rewards and an absorbing set of states, and focus on estimating the expected cumulative reward to absorption starting from any state. In this setting we develop sufficient conditions under which the estimate resulting from the combined approach has a mean square error that converges to zero asymptotically, even when the estimate formed by using only the importance sampling change of measure has infinite variance. In particular, we consider the problem of estimating the small buffer overflow probability in a queuing network, where the change of measure suggested in the literature is shown to have infinite variance for certain parameter values and where an appropriate combination of IS and the TD method is empirically seen to converge much faster than naive simulation.
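To make the setting concrete, the following is a minimal sketch, not the authors' exact algorithm, of a tabular TD(0) scheme that simulates the chain under an importance sampling transition matrix and corrects each update with the one-step likelihood ratio. All names and inputs here (the original transition matrix P, the sampling matrix Q, the reward array, the absorbing set, and the diminishing step size) are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def td_is_estimate(P, Q, reward, absorbing, start, episodes=10_000, seed=0):
    """Estimate V[s] = expected cumulative reward to absorption under P,
    while simulating trajectories under the IS transition matrix Q.

    Assumes Q[s, s'] > 0 whenever P[s, s'] > 0 (absolute continuity),
    so that the likelihood ratio P[s, s'] / Q[s, s'] is well defined.
    """
    rng = np.random.default_rng(seed)
    n = P.shape[0]
    V = np.zeros(n)          # V[s]: current estimate of reward-to-absorption from s
    visits = np.zeros(n)     # per-state visit counts used for diminishing step sizes

    for _ in range(episodes):
        s = start
        while s not in absorbing:
            s_next = rng.choice(n, p=Q[s])          # simulate under the IS measure Q
            w = P[s, s_next] / Q[s, s_next]         # one-step likelihood ratio
            visits[s] += 1
            alpha = 1.0 / visits[s]                 # stochastic approximation step size
            # IS-weighted TD(0) update; V is identically 0 on absorbing states,
            # and the fixed point is the value function under the original measure P.
            V[s] += alpha * (w * (reward[s, s_next] + V[s_next]) - V[s])
            s = s_next
    return V
```

The key design point in this sketch is that the likelihood ratio enters only one transition at a time inside the stochastic approximation update, rather than multiplying over a whole trajectory as in a pure IS estimator; this per-step weighting is what limits the growth of the weights and is the mechanism through which the combination can remain well behaved even when the plain IS estimator has infinite variance.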
