Abstract

We propose a reinforcement learning (RL) approach to computing the quasi-stationary distribution. Based on the fixed-point formulation of the quasi-stationary distribution, we minimize the KL-divergence between the two Markovian path distributions induced by a candidate distribution and the true target distribution. To solve this challenging minimization problem by gradient descent, we apply a reinforcement learning technique, introducing reward and value functions. We derive the corresponding policy gradient theorem and design an actor-critic algorithm to learn the optimal solution together with the value function. Numerical examples on finite-state Markov chains are presented to demonstrate the new method.
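As a rough illustration of the algorithmic ingredients named above (reward, value function, policy gradient, actor-critic), the following is a minimal sketch of a generic tabular average-reward actor-critic loop on a finite state space, not the authors' algorithm. The dynamics and reward in `step`, the softmax parameterization `policy`, the learning rates, and the state-space size are all illustrative placeholders.

```python
import numpy as np

# Generic tabular average-reward actor-critic on a finite state space.
# The environment `step` and its reward are placeholders, NOT the reward
# derived in the paper; they only stand in for the killed-chain setting.
rng = np.random.default_rng(0)
n_states = 5

theta = np.zeros(n_states)   # actor parameters: softmax-parameterized distribution over states
V = np.zeros(n_states)       # critic: tabular value function
avg_reward = 0.0             # running estimate of the time-averaged reward
lr_actor, lr_critic, lr_avg = 0.01, 0.05, 0.01

def policy(theta):
    """Candidate distribution over the finite states, parameterized by a softmax."""
    z = np.exp(theta - theta.max())
    return z / z.sum()

def step(state, action, rng):
    """Placeholder dynamics and reward (hypothetical, for illustration only)."""
    next_state = rng.integers(n_states)
    reward = -abs(int(action) - int(state)) / n_states
    return next_state, reward

state = 0
for t in range(10_000):
    probs = policy(theta)
    action = rng.choice(n_states, p=probs)        # sample from the candidate distribution
    next_state, reward = step(state, action, rng)

    # Differential TD error of the average-reward setting.
    td_error = reward - avg_reward + V[next_state] - V[state]
    avg_reward += lr_avg * td_error               # track the time-averaged reward
    V[state] += lr_critic * td_error              # critic update

    # Actor (policy-gradient) update: grad of log softmax is one_hot(action) - probs.
    grad_log_pi = -probs
    grad_log_pi[action] += 1.0
    theta += lr_actor * td_error * grad_log_pi

    state = next_state

print("learned candidate distribution:", policy(theta))
```

The softmax parameterization keeps the candidate distribution on the probability simplex, which is why the actor update uses the log-softmax gradient rather than updating the probabilities directly.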

Highlights

  • The quasi-stationary distribution (QSD) describes the long-time statistical behavior of a stochastic process that is almost surely killed, conditioned on the process surviving [1]

  • We focus on computing the quasi-stationary distribution, denoted by α(x), on a metric space E

  • Before introducing the reinforcement learning (RL) method for our QSD problem, we develop a general formulation based on the KL-divergence between two path distributions (the standard forms are recalled right after this list)
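For orientation, these are the textbook forms the highlights refer to, not formulas quoted from the paper: the QSD is invariant under conditioning on survival, and the KL-divergence between two path distributions has the usual Radon-Nikodym form.

```latex
% Standard definitions; alpha and E as in the highlights, tau is the killing time.
% alpha is a QSD for the killed process (X_t) if conditioning on survival leaves it invariant:
\mathbb{P}_{\alpha}\!\left( X_t \in A \mid \tau > t \right) = \alpha(A),
\qquad A \subseteq E \ \text{measurable},\quad t \ge 0 .

% KL-divergence between two path distributions P and Q (with P absolutely continuous w.r.t. Q):
D_{\mathrm{KL}}(P \,\|\, Q) = \mathbb{E}_{P}\!\left[ \log \frac{dP}{dQ} \right].
```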


Summary

Introduction

The quasi-stationary distribution (QSD) describes the long-time statistical behavior of a stochastic process that is almost surely killed, conditioned on the process surviving [1]. Traditional numerical linear algebra methods can be applied to compute the quasi-stationary distribution on a finite state space, for example the power method [16], the multi-grid method [17], and Arnoldi's algorithm [18]. These eigenvector methods produce a stochastic vector representing the QSD rather than generating samples from it. Motivated by the use of RL for rare-event sampling problems, we transform the minimization of the KL divergence between P and Q into the maximization of a time-averaged reward function and define the corresponding value function V(x) at each state x. This completes our modeling of RL for the quasi-stationary distribution problem.
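To make the eigenvector viewpoint concrete, here is a small self-contained sketch of the power method: for a finite-state killed chain, the QSD is the normalized left principal eigenvector of the transition matrix restricted to the surviving states, and power iteration with renormalization converges to it under the usual irreducibility and aperiodicity assumptions. The 3-state sub-stochastic matrix below is made up for illustration and is not an example from the paper.

```python
import numpy as np

# Sub-stochastic transition matrix among the surviving (non-killed) states.
# Row sums are below 1; the deficit is the per-step killing probability.
P = np.array([
    [0.50, 0.30, 0.10],   # state 0: killed with probability 0.10
    [0.20, 0.50, 0.20],   # state 1: killed with probability 0.10
    [0.10, 0.30, 0.40],   # state 2: killed with probability 0.20
])

def qsd_power_method(P, tol=1e-12, max_iter=100_000):
    """Approximate the QSD as the normalized left principal eigenvector of P."""
    alpha = np.full(P.shape[0], 1.0 / P.shape[0])   # start from the uniform distribution
    for _ in range(max_iter):
        nxt = alpha @ P          # one step of the killed chain
        nxt /= nxt.sum()         # renormalize over the survivors
        if np.max(np.abs(nxt - alpha)) < tol:
            break
        alpha = nxt
    return alpha

alpha = qsd_power_method(P)
print("QSD estimate:", alpha)
# Sanity check: alpha @ P is proportional to alpha (left-eigenvector relation).
print("normalized alpha @ P:", (alpha @ P) / (alpha @ P).sum())
```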

Quasi-Stationary Distribution
Review of Simulation Methods for Quasi-Stationary Distribution
Learn Quasi-Stationary Distribution
Formulation of RL and Policy Gradient Theorem
Learn QSD
Actor-Critic Algorithm
Numerical Experiment
Loopy Markov Chain
Summary and Conclusions