Abstract

The optimal control of nonlinear stochastic systems is considered in this paper. The central role played by the Fokker-Planck-Kolmogorov equation in the stochastic control problem is shown under the assumption of asymptotic stability. A computational approach for the problem is devised based on policy iteration/successive approximations and a finite-dimensional approximation of the control-parametrized diffusion operator, i.e., the controlled Fokker-Planck operator. Several numerical examples are provided to show the efficacy of the proposed computational methodology.

Introduction

The stochastic analysis, design, and control of nonlinear dynamical systems require the study of the time evolution of the Probability Density Function (PDF), p(t,x), corresponding to the state, x, of the relevant dynamic system. The PDF is given by the solution of the Fokker-Planck-Kolmogorov equation (FPE), a partial differential equation in the PDF whose coefficients are defined by the underlying dynamical system's parameters. In this paper, approximate solutions of the FPE are considered and subsequently leveraged for the design of controllers for nonlinear stochastic dynamical systems. In the past few years, the authors have developed a generalized multi-resolution meshless FEM methodology, the partition of unity FEM (PUFEM), for the solution of the FPE, utilizing the recently developed GLOMAP (Global Local Orthogonal MAPpings) methodology to provide the partition of unity functions6,7 and the orthogonal local basis functions. The PUFEM is a Galerkin projection method, and the solution is characterized in terms of a finite-dimensional representation of the Fokker-Planck operator underlying the problem. The methodology is also highly amenable to parallelization.8-12

Though the FPE is invaluable in quantifying the evolution of uncertainty through nonlinear systems, perhaps its greatest benefit may be in the stochastic analysis, design, and control of nonlinear systems. In the context of nonlinear stochastic control, Markov Decision Processes (MDPs) have long been one of the most widely used frameworks for discrete-time stochastic control. However, the Dynamic Programming (DP) equations underlying MDPs suffer from the curse of dimensionality.13-15 Various approximate Dynamic Programming (ADP) methods have been proposed in the past several years to overcome the curse of dimensionality,15-19 and they can broadly be categorized as functional reinforcement learning. These methods are essentially model-free methods of approximating the optimal control policy in stochastic optimal control problems. They generally fall into the categories of value function approximation methods,15 policy gradient/approximation methods17,18 and actor-critic methods.16,19 These methods attempt to reduce the dimensionality of the DP problem through a compact parametrization of the value functions (with respect to …
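For reference, a standard form of the FPE for a generic controlled Itô diffusion is sketched below; the drift f, diffusion g, and noise intensity Q are generic placeholders rather than the particular systems treated in this paper.

```latex
% Standard Fokker-Planck-Kolmogorov equation for a generic controlled Ito diffusion
%   dx = f(x,u) dt + g(x) dW,  with W a Wiener process of intensity Q.
% The drift f, diffusion g, and intensity Q are placeholders, not the paper's systems.
\frac{\partial p(t,x)}{\partial t}
  = -\sum_{i} \frac{\partial}{\partial x_i}\!\left[ f_i(x,u)\, p(t,x) \right]
    + \frac{1}{2}\sum_{i,j} \frac{\partial^2}{\partial x_i \partial x_j}
      \!\left[ \left( g(x)\, Q\, g^{\mathsf{T}}(x) \right)_{ij} p(t,x) \right]
  \;\equiv\; \mathcal{L}^{u}_{\mathrm{FP}}\, p(t,x)
```

A Galerkin/PUFEM approximation then represents p(t,x) in a finite basis and projects the controlled operator onto that basis, yielding the finite-dimensional representation of the Fokker-Planck operator referred to above.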
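The policy iteration/successive approximation idea can be illustrated with a minimal sketch, assuming a 1-D double-well SDE, a simple finite-difference discretization of the controlled generator (in place of the paper's PUFEM discretization), and a discounted quadratic cost; all names, parameters, and the specific dynamics below are illustrative assumptions, not the paper's.

```python
# Minimal sketch of policy iteration (successive approximations) for a discretized
# stochastic optimal control problem. Assumptions, not from the paper: a 1-D
# double-well SDE dx = (x - x^3 + u) dt + sigma dW, a uniform grid, an upwind
# finite-difference approximation of the controlled generator, and a discounted
# quadratic running cost. All names and parameters are illustrative.
import numpy as np

N, xmax = 201, 3.0
x = np.linspace(-xmax, xmax, N)
h = x[1] - x[0]
sigma, rho = 0.5, 1.0                      # noise intensity, discount rate
u_grid = np.linspace(-2.0, 2.0, 41)        # candidate feedback control values
cost = lambda xs, u: xs**2 + 0.1 * u**2    # running cost l(x, u)

def generator(u_vec):
    """Upwind finite-difference approximation of the controlled generator L_u
    for a feedback policy u_vec giving the control at each grid point."""
    A = np.zeros((N, N))
    drift = x - x**3 + u_vec
    d = 0.5 * sigma**2 / h**2
    for i in range(1, N - 1):
        a = drift[i]
        A[i, i - 1] = d + max(-a, 0.0) / h   # diffusion + upwinded drift (leftward)
        A[i, i + 1] = d + max(a, 0.0) / h    # diffusion + upwinded drift (rightward)
        A[i, i] = -(A[i, i - 1] + A[i, i + 1])
    return A

def policy_iteration(max_iter=50):
    policy = np.zeros(N)                     # start from the zero control
    for _ in range(max_iter):
        # policy evaluation: solve (rho I - L_u) V = l(x, u(x)) for the current policy
        V = np.linalg.solve(rho * np.eye(N) - generator(policy), cost(x, policy))
        # policy improvement: pointwise minimization of the Hamiltonian over u_grid
        # (the diffusion term is u-independent, so it drops out of the minimization)
        dVdx = np.gradient(V, h)
        H = (cost(x[:, None], u_grid[None, :])
             + ((x - x**3)[:, None] + u_grid[None, :]) * dVdx[:, None])
        new_policy = u_grid[np.argmin(H, axis=1)]
        if np.array_equal(new_policy, policy):   # policy is stationary -> converged
            break
        policy = new_policy
    return V, policy

V, policy = policy_iteration()
print("approximate optimal cost-to-go at x = 0:", V[N // 2])
```

In the approach described above, the analogous iteration is built on the finite-dimensional (PUFEM/Galerkin) approximation of the controlled Fokker-Planck operator rather than on a finite-difference grid.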
