Abstract
In this paper, we identify a class of input-constrained optimal control problems which can be approximately solved using Reinforcement Learning (RL) approaches. We start with a general class of problems which do not admit the theoretical assumptions used to derive RL frameworks. We then restrict this class with additional conditions on the dynamics and the objective function as deemed necessary. We focus on two assumptions: (i) the smoothness of the value function, which is typically not satisfied in input-constrained problems, and (ii) the form of the objective function, which can be more general than what has been proposed in previous formulations. For the first assumption, we use the method of vanishing viscosity to derive the conditions under which RL approaches can be used to find an approximate solution. These conditions relax a differentiability assumption on the value function to a continuity assumption, thereby extending the applicability of RL frameworks. For the second assumption, we generalize the specific integrand form of the control cost used in previous formulations to a broader class of cost functions that guarantee continuity of the control policy. Using these results, we present a new partially model-free RL framework for optimal control of input-constrained continuous-time systems. Our RL framework requires an initial stabilizing policy and guarantees uniform ultimate boundedness of the state variables. We demonstrate our results through simulation examples.