Abstract

This paper considers a class of reinforcement-based learning (namely, perturbed learning automata) and provides a stochastic-stability analysis in repeatedly played, positive-utility, finite strategic-form games. Prior work on this class of learning dynamics primarily analyzes asymptotic convergence through stochastic approximations, where convergence can be associated with the limit points of an ordinary differential equation (ODE). However, analyzing global convergence through an ODE approximation requires the existence of a Lyapunov or a potential function, which naturally restricts the analysis to a narrow class of games. To overcome these limitations, this paper introduces an alternative framework for analyzing asymptotic convergence that is based upon an explicit characterization of the invariant probability measure of the induced Markov chain. We further provide a methodology for computing the invariant probability measure in positive-utility games, together with an illustration in the context of coordination games.
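
To make the central object of the analysis concrete, the following minimal sketch (not taken from the paper) computes the invariant probability measure of a finite Markov chain with transition matrix P, i.e., the distribution pi satisfying pi P = pi with entries summing to one. The two-state matrix at the end is purely illustrative.

```python
# Minimal sketch: invariant (stationary) probability measure of a finite,
# ergodic Markov chain with transition matrix P. Not code from the paper.
import numpy as np

def invariant_measure(P: np.ndarray) -> np.ndarray:
    """Return pi such that pi P = pi and sum(pi) = 1."""
    n = P.shape[0]
    # Stack the balance equations (P - I)^T pi = 0 with the normalization
    # constraint 1^T pi = 1, and solve in the least-squares sense.
    A = np.vstack([(P - np.eye(n)).T, np.ones((1, n))])
    b = np.zeros(n + 1)
    b[-1] = 1.0
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pi

# Illustrative two-state chain (values chosen arbitrarily).
P = np.array([[0.9, 0.1],
              [0.2, 0.8]])
print(invariant_measure(P))  # approximately [0.667, 0.333]
```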

Highlights

  • Multi-agent formulations have been utilized to tackle distributed optimization problems, since communication and computational complexity might be an issue under centralized schemes

  • We provide a stochastic-stability analysis with a detailed characterization of the invariant probability measure of the induced Markov chain

  • We consider a class of reinforcement-based learning dynamics that belongs to the family of discrete-time replicator dynamics and learning automata, and we provide an explicit characterization of the invariant probability measure of the induced Markov chain


Summary

INTRODUCTION

Multi-agent formulations have been utilized to tackle distributed optimization problems, since communication and computational complexity might be an issue under centralized schemes. Each agent cannot access the actions selected or the utilities received by other agents. In such repeatedly played strategic-form games, a popular objective for payoff-based learning is to guarantee convergence (in some sense) to Nash equilibria. Reinforcement-based learning has been utilized in strategic-form games in order for agents to gradually learn to play Nash equilibria. It may appear under alternative forms, including discrete-time replicator dynamics [5], learning automata [6], [7] and Q-learning [8]. We illustrate this methodology in the context of coordination games and provide a simulation study in distributed network formation. This illustration is of independent interest since it extends prior work in coordination games under reinforcement-based learning, where convergence to mixed strategy profiles may only be excluded under strong conditions on the utility function (e.g., existence of a potential function).
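
As a hedged illustration of the kind of dynamics considered here, the sketch below simulates one common form of perturbed learning automata in a 2x2 coordination game: with small probability an agent perturbs (plays uniformly at random), otherwise it samples an action from its mixed strategy, which is then reinforced in proportion to the received positive utility. The step size eps, perturbation probability lam, payoff matrix, and horizon are illustrative assumptions, not values from the paper.

```python
# Hedged sketch of perturbed learning automata in a 2x2 coordination game.
# All parameter values below are illustrative, not taken from the paper.
import numpy as np

rng = np.random.default_rng(0)

# Symmetric coordination game: both agents prefer to match actions.
U = np.array([[1.0, 0.0],
              [0.0, 1.0]])

eps, lam, T = 0.05, 0.01, 20_000
x = [np.array([0.5, 0.5]), np.array([0.5, 0.5])]   # initial mixed strategies

for _ in range(T):
    # With small probability lam an agent perturbs (uniform random action);
    # otherwise it samples from its current mixed strategy.
    a = [rng.integers(2) if rng.random() < lam else rng.choice(2, p=x[i])
         for i in range(2)]
    u = [U[a[0], a[1]], U[a[1], a[0]]]              # positive utilities in [0, 1]
    for i in range(2):
        e = np.eye(2)[a[i]]
        # Replicator-like reinforcement: move toward the played action
        # proportionally to the received utility (strategies stay on the simplex).
        x[i] = x[i] + eps * u[i] * (e - x[i])

print(x)   # strategies typically concentrate near a pure coordination outcome
```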

Terminology
Perturbed Learning Automata
Related work
Contributions
STOCHASTIC STABILITY
Stochastic stability
Discussion
Unperturbed Process
Perturbed process
STOCHASTICALLY STABLE STATES
Background on finite Markov chains
Approximation of one-step transition probability
Approximation of stationary distribution
Stochastically stable states
Simulation study in distributed network formation
CONCLUSIONS & FUTURE WORK