Model-Free λ-Policy Iteration for Discrete-Time Linear Quadratic Regulation.

Yongliang Yang, Bahare Kiumarsi, Hamidreza Modares, Chengzhong Xu

doi:10.1109/tnnls.2021.3098985

Abstract

This article presents a model-free λ-policy iteration (λ-PI) algorithm for the discrete-time linear quadratic regulation (LQR) problem. To solve the algebraic Riccati equation of the LQR problem iteratively, we define two novel matrix operators, named the weighted Bellman operator and the composite Bellman operator. The λ-PI algorithm is first designed as a recursion with the weighted Bellman operator, and it is then shown to be equivalent to a fixed-point iteration with the composite Bellman operator. The contraction and monotonicity properties of the composite Bellman operator guarantee the convergence of the λ-PI algorithm. In contrast to the policy iteration (PI) algorithm, λ-PI does not require an admissible initial policy, and it converges faster than the value iteration (VI) algorithm. A model-free extension of the λ-PI algorithm is developed using the off-policy reinforcement learning technique. It is also shown that the off-policy variants of the λ-PI algorithm are robust against probing noise. Finally, simulation examples validate the efficacy of the λ-PI algorithm.
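The abstract does not define the weighted or composite Bellman operators, so the following is only a minimal model-based sketch under a stated assumption: it uses the classical λ-PI recursion from the dynamic programming literature, specialized to LQR, in which each update P_next satisfies P_next = (1 - λ) T_K(P) + λ T_K(P_next), where K is the gain greedy with respect to P and T_K(X) = Q + K'RK + (A - BK)' X (A - BK). The function names, the example system, and this form of the update are assumptions for illustration and may differ from the paper's operators; the model-free off-policy variant is not covered.

```python
import numpy as np


def greedy_gain(P, A, B, R):
    """Gain K of the policy u = -K x that is greedy with respect to P."""
    return np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)


def lambda_pi_step(P, A, B, Q, R, lam):
    """One assumed lambda-PI update.

    Solves X = C + (1 - lam) * Ac' P Ac + lam * Ac' X Ac for X, where
    Ac = A - B K and C = Q + K' R K. With lam = 0 this is a value
    iteration step; with lam = 1 it is a policy evaluation (PI) step.
    """
    K = greedy_gain(P, A, B, R)
    Ac = A - B @ K
    C = Q + K.T @ R @ K
    n = A.shape[0]
    # Row-major vectorization: vec(Ac' X Ac) = kron(Ac', Ac') vec(X).
    M = np.kron(Ac.T, Ac.T)
    rhs = (C + (1.0 - lam) * Ac.T @ P @ Ac).reshape(-1)
    X = np.linalg.solve(np.eye(n * n) - lam * M, rhs).reshape(n, n)
    return 0.5 * (X + X.T)  # symmetrize against round-off


def lambda_pi(A, B, Q, R, lam, P0=None, tol=1e-12, max_iter=1000):
    """Iterate lambda_pi_step from P0 (zero by default) until the change is below tol."""
    P = np.zeros_like(Q, dtype=float) if P0 is None else P0.astype(float)
    for _ in range(max_iter):
        P_next = lambda_pi_step(P, A, B, Q, R, lam)
        if np.linalg.norm(P_next - P) < tol:
            return P_next
        P = P_next
    return P


def riccati_operator(P, A, B, Q, R):
    """Right-hand side of the discrete-time algebraic Riccati equation."""
    G = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    return Q + A.T @ P @ A - A.T @ P @ B @ G


if __name__ == "__main__":
    # Hypothetical open-loop unstable system; K = 0 is not admissible here.
    A = np.array([[1.1, 0.3], [0.0, 0.9]])
    B = np.array([[0.0], [1.0]])
    Q = np.eye(2)
    R = np.array([[1.0]])
    P = lambda_pi(A, B, Q, R, lam=0.5)
    print("P =", P)
    print("Riccati residual:", np.linalg.norm(P - riccati_operator(P, A, B, Q, R)))
```

In this sketch the linear solve replaces the infinite weighted sum over multistep returns; it is well posed only when I - λ (Ac' ⊗ Ac') is nonsingular, which holds in particular when √λ times the spectral radius of Ac is below 1.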

Journal: IEEE Transactions on Neural Networks and Learning Systems
Publication Date: Feb 1, 2023
Citations: 95

Similar Papers

Output feedback reinforcement Q-learning control for the discrete-time linear quadratic regulator problem
Syed Ali Asad Rizvi ... Zongli Lin
01 Dec 2017

Robust Policy Iteration for Continuous-Time Linear Quadratic Regulation
Bo Pang ... Zhong-Ping Jiang
IEEE Transactions on Automatic Control | VOL. 67
01 Jan 2021

Reinforcement Learning-Based Linear Quadratic Regulation of Continuous-Time Systems Using Dynamic Output Feedback.
Syed Ali Asad Rizvi ... Zongli Lin
IEEE Transactions on Cybernetics | VOL. 50
03 Jan 2019

Optimal output tracking control of linear discrete-time systems with unknown dynamics by adaptive dynamic programming and output feedback
Xuan Cai ... Gang Wang
International Journal of Systems Science | VOL. 53
10 Jun 2022
