Optimization Landscape of Policy Gradient Methods for Discrete-Time Static Output Feedback.

Jingliang Duan, Kai Zhao, Jie Li, Shengbo Eben Li, Xuyang Chen, Lin Zhao

doi:10.1109/tcyb.2023.3323316

Abstract

In recent times, significant advancements have been made in delving into the optimization landscape of policy gradient methods for achieving optimal control in linear time-invariant (LTI) systems. Compared with state-feedback control, output-feedback control is more prevalent since the underlying state of the system may not be fully observed in many practical settings. This article analyzes the optimization landscape inherent to policy gradient methods when applied to static output feedback (SOF) control in discrete-time LTI systems subject to quadratic cost. We begin by establishing crucial properties of the SOF cost, encompassing coercivity, L-smoothness, and M-Lipschitz continuous Hessian. Despite the absence of convexity, we leverage these properties to derive novel findings regarding convergence (and nearly dimension-free rate) to stationary points for three policy gradient methods, including the vanilla policy gradient method, the natural policy gradient method, and the Gauss-Newton method. Moreover, we provide proof that the vanilla policy gradient method exhibits linear convergence toward local minima when initialized near such minima. This article concludes by presenting numerical examples that validate our theoretical findings. These results not only characterize the performance of gradient descent for optimizing the SOF problem but also provide insights into the effectiveness of general policy gradient methods within the realm of reinforcement learning.
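
To make the setting in the abstract concrete, the following is a minimal sketch of vanilla policy gradient on an SOF cost. It is not the authors' code. It assumes the standard model-based formulation x_{t+1} = A x_t + B u_t, y_t = C x_t, u_t = -K y_t with cost J(K) = tr(P_K X0), where P_K solves a discrete Lyapunov equation and X0 is the second moment of the initial state. The system matrices, the function names (`dlyap`, `sof_cost_and_grad`, `vanilla_policy_gradient`), the step size, and the step-halving safeguard are all illustrative assumptions, not taken from the article.

```python
import numpy as np

def dlyap(M, W):
    """Solve X = M X M^T + W via Kronecker products (adequate for small systems)."""
    n = M.shape[0]
    vec_x = np.linalg.solve(np.eye(n * n) - np.kron(M, M), W.reshape(-1))
    return vec_x.reshape(n, n)

def sof_cost_and_grad(K, A, B, C, Q, R, X0):
    """Return (J(K), gradient); J is inf and the gradient None if K is not stabilizing."""
    Acl = A - B @ K @ C  # closed-loop matrix under u = -K y
    if np.max(np.abs(np.linalg.eigvals(Acl))) >= 1.0:
        return np.inf, None
    P = dlyap(Acl.T, Q + C.T @ K.T @ R @ K @ C)  # P = Q + C'K'RKC + Acl' P Acl
    Sigma = dlyap(Acl, X0)                       # Sigma = X0 + Acl Sigma Acl'
    grad = 2.0 * ((R + B.T @ P @ B) @ K @ C - B.T @ P @ A) @ Sigma @ C.T
    return float(np.trace(P @ X0)), grad

def vanilla_policy_gradient(K0, A, B, C, Q, R, X0, step=1e-2, iters=200):
    """Gradient descent on J(K); the step is halved whenever the cost would rise
    or the gain would leave the stabilizing set (a safeguard for this sketch)."""
    K = K0.copy()
    J, g = sof_cost_and_grad(K, A, B, C, Q, R, X0)
    for _ in range(iters):
        eta = step
        J_new, g_new = sof_cost_and_grad(K - eta * g, A, B, C, Q, R, X0)
        while J_new > J and eta > 1e-12:
            eta *= 0.5
            J_new, g_new = sof_cost_and_grad(K - eta * g, A, B, C, Q, R, X0)
        if J_new > J:
            break  # no acceptable step found; stop at the current gain
        K, J, g = K - eta * g, J_new, g_new
    return K, J

# Hypothetical two-state system in which only the first state is measured.
A = np.array([[0.9, 0.2], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
Q, R, X0 = np.eye(2), np.eye(1), np.eye(2)

K0 = np.zeros((1, 1))  # stabilizing start, since A itself is stable
J0, g0 = sof_cost_and_grad(K0, A, B, C, Q, R, X0)
K_opt, J_opt = vanilla_policy_gradient(K0, A, B, C, Q, R, X0)
print("initial cost:", J0, "final cost:", J_opt, "gain:", K_opt)
```

The sketch starts from a stabilizing gain because the SOF cost is finite only on the set of stabilizing gains. Since the cost is nonconvex, a run like this can only be expected to approach a stationary point, which matches the type of guarantee the abstract describes. The natural policy gradient and Gauss-Newton methods mentioned in the abstract are not sketched here.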

Journal: IEEE Transactions on Cybernetics
Publication Date: Jun 1, 2024
Citations: 2

Similar Papers

Output Feedback Stabilization for Discrete-time Systems with A Time-varying Delay
He Yong ... Wu Min
01 Jul 2006

Static output feedback control of nonlinear aeroelastic response of a slender wing
Mayuresh Patil ... Dewey Hodges
03 Apr 2000

Output Feedback Stabilization for a Discrete-Time System With a Time-Varying Delay
Yong He ... Min Wu
IEEE Transactions on Automatic Control | VOL. 53
01 Nov 2008

Optimal guaranteed cost control of uncertain systems via static and dynamic output feedback
S.O. Reza Moheimani ... Ian R Petersen
Automatica | VOL. 32
01 Apr 1996
