Abstract
In this paper, a data-driven suboptimal state-feedback controller is designed for a continuous-time linear parameter-varying (LPV) system using reinforcement learning. The time-varying parameters lie in a polytope, and the system matrices depend affinely on the parameters. Two novel algorithms, one on-policy and one off-policy, are proposed that use data available from the vertex systems of the polytope to minimise a performance index and admit a common Lyapunov function (CLF). At each iteration, a convex optimisation problem is derived from a Lyapunov inequality. The algorithms yield a stabilising feedback gain at every iteration and converge to a common Lyapunov function. We demonstrate the efficacy of the proposed method through simulations of two case studies.