Kakutani’s Fixed-Point Theorem and Multiplayer Discounted Stochastic Games
In this chapter, we prove Kakutani's Fixed-Point Theorem, which extends Brouwer's Fixed-Point Theorem to correspondences (set-valued mappings). We then define the concept of a $\lambda$-discounted equilibrium and, using Kakutani's Fixed-Point Theorem, prove that every multiplayer stochastic game admits a stationary $\lambda$-discounted equilibrium for every discount factor $\lambda \in (0,1]$.
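For reference, the two central objects can be stated as follows; these are standard formulations, and the normalization of the discounted payoff shown below is one common convention:

```latex
% Kakutani's Fixed-Point Theorem.
% Let $K \subseteq \mathbb{R}^n$ be nonempty, compact, and convex, and let
% $F \colon K \rightrightarrows K$ be an upper semicontinuous correspondence
% with nonempty, convex, closed values. Then $F$ admits a fixed point:
\exists\, x^* \in K \quad \text{such that} \quad x^* \in F(x^*).

% The $\lambda$-discounted payoff of player $i$ under the strategy profile
% $\sigma$ and initial state $s$, with stage payoffs $u^i(s_t, a_t)$:
\gamma_\lambda^i(s; \sigma)
  \;=\; \mathbb{E}_s^{\sigma}\!\left[\, \sum_{t=1}^{\infty}
        \lambda (1-\lambda)^{t-1}\, u^i(s_t, a_t) \right].
```

A stationary $\lambda$-discounted equilibrium is then a profile $\sigma^*$ of stationary strategies such that no player $i$ can increase $\gamma_\lambda^i(s; \sigma^*)$ at any initial state $s$ by a unilateral deviation.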
- Research Article
482
- 10.1137/1009030
- Apr 1, 1967
- SIAM Review
Contraction Mappings in the Theory Underlying Dynamic Programming, by Eric V. Denardo.
- Book Chapter
13
- 10.1007/978-94-011-3760-7_14
- Jan 1, 1991
We consider stochastic games with uncountable state space and prove the existence of a Nash equilibrium in Markovian strategies, reached as a limit, in an appropriate sense, of finite-horizon Markovian equilibria (the latter exist even with compact metric action spaces). Our approach is based on Glicksberg's fixed-point theorem, basic measurable selection results, and a generalized Fatou lemma.
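For reference, the fixed-point theorem invoked here extends Kakutani's theorem beyond Euclidean spaces; a standard formulation reads:

```latex
% Glicksberg's Fixed-Point Theorem.
% Let $K$ be a nonempty, compact, convex subset of a locally convex
% Hausdorff topological vector space, and let $F \colon K \rightrightarrows K$
% be an upper semicontinuous set-valued map with nonempty, convex,
% closed values. Then
\exists\, x^* \in K \quad \text{such that} \quad x^* \in F(x^*).
```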
- Research Article
- 10.24908/iqurcp19805
- Aug 29, 2025
- Inquiry@Queen's Undergraduate Research Conference Proceedings
In the context of multi-agent systems with decentralized information structures, we study rigorously justified convergence results and associated learning algorithms that converge to equilibria. With this objective in mind, we first review classical equilibrium results, focusing on finite-player games with pure or mixed strategy sets. Results such as Kakutani's fixed-point theorem and Sion's minimax theorem establish existence under relatively broad conditions. Building on this background, we then study learning dynamics, including best- and better-response processes, in which players periodically revise their strategies to improve payoffs relative to their previous actions via a policy revision process. This induces a graph on the set of policies, which underpins our mathematical approach combining graph theory, game theory, stochastic control, and Markov processes. While learning via best/better-response dynamics converges under certain conditions reported by Arslan et al., a different approach to policy revision, termed satisficing (which may be viewed as a win-stay, lose-shift algorithm) and introduced by Yongacoglu et al., provides a strictly richer graph structure and is applicable to a much broader class of games; in particular, these games generalize weakly acyclic games. The question we study is to characterize precisely the set of games for which such a satisficing process ensures convergence to equilibrium. In particular, we address an open question raised by Yongacoglu et al. on necessary and sufficient conditions for convergence to equilibria from any initial policy profile. On sufficiency, we present a generalization that relaxes the requirements to allow multiple pure Nash equilibria, provided at least one is strict and subgame-unique.
Our research also presents a nontrivial example of a game that admits a strict pure Nash equilibrium in each induced subgame and yet fails to converge via satisficing paths, showing that these conditions are insufficient and thereby yielding a necessary condition as well.
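The win-stay, lose-shift revision described above can be simulated in a few lines. The sketch below uses an illustrative 2x2 coordination game and a uniform re-randomization rule, both of which are our own assumptions rather than the exact construction of Yongacoglu et al.:

```python
import random

# payoffs[i][(a0, a1)]: payoff to player i at the joint action (a0, a1).
# A coordination game with two strict pure Nash equilibria, (0, 0) and (1, 1).
payoffs = [
    {(0, 0): 2, (0, 1): 0, (1, 0): 0, (1, 1): 1},  # player 0
    {(0, 0): 2, (0, 1): 0, (1, 0): 0, (1, 1): 1},  # player 1
]

def best_reply_payoff(i, profile):
    """Best payoff player i could obtain by deviating unilaterally."""
    other = profile[1 - i]
    joint = lambda a: (a, other) if i == 0 else (other, a)
    return max(payoffs[i][joint(a)] for a in (0, 1))

def satisficing_step(profile, rng):
    """Players already best-responding keep their action (win-stay);
    the others re-randomize uniformly (lose-shift)."""
    return tuple(
        profile[i] if payoffs[i][profile] == best_reply_payoff(i, profile)
        else rng.choice((0, 1))
        for i in (0, 1)
    )

rng = random.Random(0)
profile = (0, 1)  # start away from both equilibria
for _ in range(100):
    profile = satisficing_step(profile, rng)
print(profile)  # one of the pure Nash equilibria, (0, 0) or (1, 1)
```

Starting from the miscoordinated profile (0, 1), both players fail the satisficing test and re-randomize until the process is absorbed at one of the two pure equilibria, where both players stay put.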
- Preprint Article
- 10.20944/preprints202504.1788.v1
- Apr 21, 2025
- Preprints.org
This study presents Victoria, an algorithm demonstrating that there exist parameters φ, k, j, considered optimal, which guarantee the player an advantage over the house in the sports betting field in the medium and long run, with satisfactory profits. After n Small Blocks (j_n) and Intermediate Blocks (IBs), each containing k independent events with the same probability p, we conclude that in a sequence of independent events the cost-benefit ratio always satisfies β (success block) > ζ (failure block). Considering the possible impacts of Victoria on decision theory and game theory, a function η(X_t), called the “Predictable Random Component”, was also observed and presented. The η(X_t) function (or fv(X_t) in the context of the VNAE) expresses the fact that, in a game where randomness drawn from a uniform distribution is crucial, any player with advance knowledge of that randomness, combined with other additional actions (whether supported by statistical, mathematical, or physical operations and/or other cognitive actions), can determine an optimal strategy whose expected payoff is always positive, regardless of what happens after the n sequences determined by the player. In addition, the possibility of a new equilibrium was also observed, resulting in the theorization of the Victoria-Nash Asymmetric Equilibrium (VNAE). We develop a rigorous statistical foundation, incorporating Markov processes, Brouwer's fixed-point theorem, and statistical convergence, to validate the existence of asymmetric advantages in structured random systems.
Anchored by Stirling numbers, the Law of Large Numbers, the Central Limit Theorem, Kelly's criterion, renewal theory, the Unified Neutral Theory of Biodiversity, the Nash equilibrium, and Monte Carlo simulation itself, the proposed new equilibrium is expected to be a solid mathematical model for games in which one of the players tends to have asymmetric advantages. In this sense, the VNAE is an extension of the classic Nash equilibrium, the Stackelberg equilibrium, and the Bayesian equilibrium. Victoria has shown that, by understanding the general behavior of randomness through statistics, we can in a way partially “predict” the future and shape it in our favor. In game theory, the impact is expected to be relevant to better understanding and adapting concepts such as stochastic games, asymmetric games, zero-sum games, repeated games, and imperfect-information games. By bridging gaps between theory and real-world applications, this work positions the VNAE as a foundational tool for interdisciplinary advances in decision-making under uncertainty.
- Research Article
- 10.1080/02331934.2021.1871612
- Jan 16, 2021
- Optimization
This paper studies nonzero-sum continuous-time constrained average stochastic games with independent state processes. In these game models, each player independently controls a continuous-time Markov chain, but players are coupled through the immediate cost functions. The transition rates and immediate cost functions are allowed to be unbounded. Each player seeks to minimize a certain expected average cost, subject to constraints on other expected average costs. By introducing average occupation measures, we establish a one-to-one relationship between constrained Nash equilibria and the fixed points of a certain multifunction defined on the product space of average occupation measures. Then, by using a fixed point theorem, we show the existence of constrained Nash equilibria. Finally, we show that each stationary Nash equilibrium corresponds to a global minimizer of a certain mathematical program.
- Research Article
- 10.1214/aoms/1177693059
- Dec 1, 1971
- The Annals of Mathematical Statistics
We introduce a sequential competitive decision process that is a generalization of noncooperative finite games and of two-person zero-sum stochastic games (hence, of Markovian decision processes). We prove the existence of equilibrium points under criteria of discounted gain and of average gain. Two-person zero-sum stochastic games and noncooperative finite games were introduced in elegant papers by Shapley [22] and Nash [16], [17]. Shapley's work prompted a series of papers [1], [4], [5], [10], [11], [12], [14], [18], [26] concerned with the existence of minimax solutions and algorithms for their computation. Even for the two-person zero-sum case, no finite algorithm yet exists. Nash's papers led to a sizeable literature in both mathematics and economics. Mills' [15] work, for example, is related to our characterization of equilibrium points in Section 4. Noncooperative stochastic games may yield fruitful models for several phenomena in the social sciences. Theories of economic markets, for example, have increasingly sought to encompass sequential economic decision processes. Some recent research in social psychology has taken an analogous direction [19], [25]. I became aware of recent work by Rogers [20] shortly after completing this paper. His results and ours nearly coincide, with our Theorem 2 being slightly stronger than the comparable results in his paper. The basic difference between the papers is that Rogers relies on the Kakutani fixed point theorem whereas we use Brouwer's theorem. Our arguments are somewhat simpler as a consequence.
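The minimax solutions discussed above can be illustrated with a toy computation. The sketch below uses a hypothetical 2x2 payoff matrix (not taken from the paper) and approximates the value of a zero-sum matrix game by grid search over the row player's mixed strategy:

```python
# Illustrative sketch: approximate the minimax value of a 2x2 zero-sum
# matrix game by grid search over the row player's mixed strategy.
# The payoff matrix A is a made-up example; the row player maximizes,
# the column player minimizes.
A = [[3.0, 0.0], [1.0, 2.0]]

def row_payoff(p, col):
    """Expected payoff when the row player mixes (p, 1 - p) and the
    column player plays pure strategy `col`."""
    return p * A[0][col] + (1 - p) * A[1][col]

# Security level of each mixed strategy on a grid, maximized over the grid.
value, p_star = max(
    (min(row_payoff(i / 1000, c) for c in range(2)), i / 1000)
    for i in range(1001)
)
print(p_star, value)  # for this matrix, the optimal mix puts weight 1/4 on row 1
```

For this particular matrix, equalizing the two column payoffs (1 + 2p = 2 - 2p) gives p = 1/4 and game value 3/2, which the grid search recovers exactly since 0.25 lies on the grid.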
- Research Article
- 10.1007/s13160-019-00397-9
- Nov 13, 2019
- Japan Journal of Industrial and Applied Mathematics
In this article, we study a discounted stochastic game to model resource optimal intrusion detection in wireless sensor networks. To address the problem of uncertainties in various network parameters, we propose a globalized robust game-theoretic framework for discounted robust stochastic games. A robust solution to the considered problem is an optimal point that is feasible for all realizations of data from a given uncertainty set. To allow a controlled violation of the constraints when the parameters move out of the uncertainty set, the concept of globalized robust framework comes into view. In this article, we formulate a globalized robust counterpart for the discounted stochastic game under consideration. With the help of globalized robust optimization, a concept of globalized robust Markov perfect equilibrium is introduced. The existence of such an equilibrium is shown for a discounted stochastic game when the number of actions of the players is finite. The contraction mapping theorem, Kakutani fixed point theorem and the concept of equicontinuity are used to prove the existence result. To compute a globalized robust Markov perfect equilibrium for the considered discounted stochastic game, a tractable representation of the proposed globalized robust counterpart is also provided. Using the derived tractable representation, we formulate a globalized robust intrusion detection system for wireless sensor networks.
- Research Article
- 10.14736/kyb-2019-1-0152
- Mar 14, 2019
- Kybernetika
The main objective of this paper is to find structural conditions under which a stochastic game between two players with total reward functions has an $\epsilon$-equilibrium. To reach this goal, results from Markov decision processes are used to find $\epsilon$-optimal strategies for each player; then a better-response correspondence, together with a more general version of Kakutani's Fixed Point Theorem, is used to obtain the $\epsilon$-equilibrium mentioned. Moreover, two examples illustrating the developed theory are presented.
- Research Article
- 10.14708/ma.v41i1.388
- Nov 9, 2013
- Mathematica Applicanda
On “Games and Dynamic Games” by A. Haurie, J.B. Krawczyk and G. Zaccour
- Research Article
- 10.1186/s13663-020-00681-1
- Sep 22, 2020
- Fixed Point Theory and Applications
We establish a fixed point theorem for the composition of nonconvex, measurable selection valued correspondences with Banach space valued selections. We show that if the underlying probability space of states is nonatomic and if the selection correspondences in the composition are K-correspondences (meaning correspondences having graphs that contain their Komlós limits), then the induced measurable selection valued composition correspondence takes contractible values and therefore has fixed points. As an application we use our fixed point result to show that all nonatomic uncountable-compact discounted stochastic games have stationary Markov perfect equilibria – thus resolving a long-standing open question in game theory.
- Research Article
- 10.3934/math.2021668
- Jan 1, 2021
- AIMS Mathematics
In this paper, two-person zero-sum Markov games with Borel state and action spaces, unbounded reward functions, and state-dependent discount factors are studied. The optimality criterion is the expected discounted criterion. First, sufficient conditions for the existence of optimal policies are given for two-person zero-sum Markov games with varying discount factors. Then, the existence of optimal policies is proved via the Banach fixed point theorem. Finally, an example on reservoir operations illustrates the existence results.
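The Banach fixed point step in such arguments can be seen numerically on a toy problem. The sketch below is a hypothetical two-state discounted dynamic program with a constant discount factor (not the paper's Borel-space, state-dependent-discount model); it iterates the Bellman operator, which is a contraction in the sup norm, until it reaches its fixed point:

```python
# Illustrative sketch (not the paper's model): value iteration for a tiny
# two-state discounted problem. The Bellman operator
#   T(V)(s) = max_a [ r(s,a) + LAM * sum_s' P(s'|s,a) V(s') ]
# is a LAM-contraction in the sup norm, so by the Banach fixed point
# theorem the iterates converge to the unique fixed point V*.

LAM = 0.9  # discount factor, assumed constant here for simplicity

# Two states, two actions: rewards R[s][a] and transition rows P[s][a].
R = [[1.0, 0.0], [0.0, 2.0]]
P = [[(0.5, 0.5), (1.0, 0.0)], [(0.0, 1.0), (0.5, 0.5)]]

def bellman(V):
    """Apply the Bellman optimality operator T to a value vector V."""
    return [max(R[s][a] + LAM * sum(p * v for p, v in zip(P[s][a], V))
                for a in range(2)) for s in range(2)]

V = [0.0, 0.0]
for _ in range(500):
    V_new = bellman(V)
    if max(abs(x - y) for x, y in zip(V_new, V)) < 1e-10:
        break
    V = V_new

# At the fixed point, V solves V = T(V) up to the stopping tolerance.
residual = max(abs(x - y) for x, y in zip(bellman(V), V))
print(V, residual)
```

Since the sup-norm distance between successive iterates shrinks by a factor of at most LAM per step, the loop terminates well within the iteration budget; the same contraction bound can be checked directly on any pair of value vectors.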
- Research Article
- 10.1016/0165-0114(94)90045-0
- Dec 1, 1994
- Fuzzy Sets and Systems
The $\epsilon$-equilibrium in transportation networks