Abstract

We develop two online data-driven methods for estimating an objective function in continuous-time linear and nonlinear deterministic systems. The primary focus is the challenge posed by unknown input dynamics (the control mapping function) in the expert system, a critical element for solving the problem online. Both methods use the learner’s and the expert’s data. The first approach, which is model-free, estimates the expert’s policy and integrates it into the learner agent to approximate the objective function associated with the optimal policy. The second approach estimates the input dynamics from the learner’s data and combines this estimate with the expert’s input-state observations to estimate the objective function. Compared to other methods for deterministic systems that rely on both the learner’s and the expert’s data, our approaches reduce complexity by eliminating the need to estimate an optimal policy after each objective function update. We analyze the convergence of the estimation techniques using Lyapunov-based methods. Numerical experiments validate the effectiveness of the developed methods.
