Abstract

This paper considers the Linear Quadratic Regulator problem for linear systems with unknown dynamics, a central problem in data-driven control and reinforcement learning. We propose a method that uses data to directly return a controller without estimating a model of the system. Sufficient conditions are given under which this method returns a stabilizing controller with guaranteed relative error when the data used to design the controller are affected by noise. This method has low complexity as it only requires a finite number of samples of the system response to a sufficiently exciting input, and can be efficiently implemented as a semi-definite programme.

Highlights

  • Control theory is witnessing renewed and growing interest in data-driven control

  • This paper considers the infinite-horizon Linear Quadratic Regulator (LQR) problem for linear time-invariant systems, one of the most studied problems in the control literature

  • Here P is the controllability Gramian of the closed-loop system (5), that is, the unique solution to equation (7): (A + BK) P (A + BK)⊤ − P + I = 0. This corresponds in the time domain to the 2-norm of the output z when impulses are applied to the input channels. It can also be interpreted as the mean-square deviation of z when d is a white process with unit covariance, which is the classic stochastic LQR formulation
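As an illustration of equation (7), the following is a minimal sketch of how the Gramian P could be computed numerically for a given gain K. It solves the equation by vectorization with NumPy only; the function name `closed_loop_gramian` and the example matrices (a double integrator with a hand-picked stabilizing gain) are assumptions made for illustration and do not come from the paper.

```python
import numpy as np

def closed_loop_gramian(A, B, K):
    """Solve (A + BK) P (A + BK)^T - P + I = 0 for P (hypothetical helper)."""
    Acl = A + B @ K
    n = Acl.shape[0]
    # The solution is unique and positive definite only if A + BK is Schur stable.
    if np.max(np.abs(np.linalg.eigvals(Acl))) >= 1.0:
        raise ValueError("A + BK is not Schur stable; the Gramian is undefined")
    # Vectorized form: (I - Acl (x) Acl) vec(P) = vec(I).
    lhs = np.eye(n * n) - np.kron(Acl, Acl)
    P = np.linalg.solve(lhs, np.eye(n).reshape(-1)).reshape(n, n)
    return 0.5 * (P + P.T)  # symmetrize against round-off

# Illustrative data (assumed, not from the paper): double integrator.
A = np.array([[1.0, 1.0], [0.0, 1.0]])
B = np.array([[0.0], [1.0]])
K = np.array([[-0.5, -1.0]])  # closed-loop eigenvalues 0.5 +/- 0.5j

P = closed_loop_gramian(A, B, K)
Acl = A + B @ K
residual = Acl @ P @ Acl.T - P + np.eye(2)

# Impulse-response view: P equals the sum of Acl^k (Acl^k)^T over k >= 0.
P_series = np.zeros((2, 2))
Ak = np.eye(2)
for _ in range(200):
    P_series += Ak @ Ak.T
    Ak = Acl @ Ak
```

The series at the end reflects the time-domain reading given above: each term is the contribution of the closed-loop response, k steps after unit impulses are applied, so the truncated sum should agree with the solution of equation (7) when A + BK is stable.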


Summary

Introduction

Control theory is witnessing renewed and growing interest in data-driven (data-based) control. Starting from [Fiechter, 1997], a tremendous effort has been made to establish non-asymptotic properties of data-driven methods. The term refers to all those methods that aim to provide closed-loop stability and performance guarantees using only a finite number of data points. A strength of our method (and of direct methods in general) is a parsimonious use of priors, which allows us to cope with situations where the noise has no convenient statistics. In such situations, indirect methods (at least those proposed for LQR) are much more difficult to pursue, since the identification (ID) step relies strongly on such statistics [Mania et al., 2019; Dean et al., 2019]. A key result in this setting states that a (noise-free) system trajectory generated by a persistently exciting input is a data-based non-parametric system model.
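To make the idea of a trajectory acting as a non-parametric model concrete, here is a minimal sketch under stated assumptions. It simulates a noise-free trajectory of x(k+1) = A x(k) + B u(k) with a random input, stacks the samples into matrices U0, X0 and X1, and recovers the closed-loop matrix A + BK from those matrices alone. The names `collect_data`, `closed_loop_from_data`, U0, X0, X1 and G are notation introduced here for illustration, and the rank test on the stacked data is used as a stand-in for the excitation condition; none of this is claimed to be the paper's exact procedure.

```python
import numpy as np

def collect_data(A, B, T, rng):
    """Simulate T noise-free steps with a random input (illustrative helper)."""
    n, m = B.shape
    U0 = rng.standard_normal((m, T))
    X = np.zeros((n, T + 1))
    X[:, 0] = rng.standard_normal(n)
    for k in range(T):
        X[:, k + 1] = A @ X[:, k] + B @ U0[:, k]
    return U0, X[:, :T], X[:, 1:]  # inputs, states, shifted states

def closed_loop_from_data(U0, X0, X1, K):
    """Return A + BK using only data, assuming [U0; X0] has full row rank."""
    n, m = X0.shape[0], U0.shape[0]
    W = np.vstack([U0, X0])
    if np.linalg.matrix_rank(W) < n + m:
        raise ValueError("data are not sufficiently rich for this gain")
    # Pick G with [U0; X0] G = [K; I]; then X1 G = (A X0 + B U0) G = A + BK.
    G = np.linalg.pinv(W) @ np.vstack([K, np.eye(n)])
    return X1 @ G

# Illustrative system (assumed); A and B are used only to generate data
# and to check the result, never inside closed_loop_from_data.
A = np.array([[1.0, 1.0], [0.0, 1.0]])
B = np.array([[0.0], [1.0]])
K = np.array([[-0.5, -1.0]])

rng = np.random.default_rng(0)
U0, X0, X1 = collect_data(A, B, T=10, rng=rng)
Acl_data = closed_loop_from_data(U0, X0, X1, K)
```

In this noise-free setting the data-based matrix should match A + BK up to round-off. With noisy data the identity no longer holds exactly, which is the situation the robustness conditions in the paper are meant to address.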

Notation and auxiliary facts
Problem definition and data-driven formulation
A data-driven SDP formulation
Data-driven solution with noisy data
Stability and performance analysis
Preliminary discussion
Noise robustness through soft constraints
Alternative based on the S-procedure
Stability and H2-norm bounds
Bounds on the relative error
Nonlinear systems
De-noising through averaging
Random linear systems
Nonlinear inverted pendulum
Concluding remarks
Findings
A Appendix
