Abstract

A data-driven approximation formulation for the state reconstruction problem of dynamical systems is presented in this paper. Without assuming an explicit mathematical model, a Hamilton–Jacobi–Bellman (HJB)-based, data-driven state reconstruction design method for monitoring and output-feedback control of dynamical systems is developed. The proposed state reconstruction design is based on a dynamic programming approach. To evaluate the proposed state reconstruction, computational experiments are conducted using only output data from the dynamical system model. The sensitivity of the algorithm parameters is also analyzed and discussed. Performance is evaluated in terms of the error metrics of the discrete linear quadratic regulator (DLQR) with output feedback under the value iteration algorithm within a reinforcement learning strategy.
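
For concreteness, the quantities the abstract refers to can be written in the standard quadratic form; this is a generic formulation consistent with the temporal-difference error given later in the summary, not an excerpt from the paper. The output-feedback DLQR cost and value function are

$$ J = \sum_{k=0}^{\infty}\left(y_k^{T} Q\, y_k + u_k^{T} R\, u_k\right), \qquad V(x_k) = x_k^{T} P\, x_k, $$

and value iteration drives the residual of the Bellman equation

$$ x_k^{T} P\, x_k = y_k^{T} Q\, y_k + u_k^{T} R\, u_k + x_{k+1}^{T} P\, x_{k+1} $$

toward zero using only measured data; a discount factor may additionally weight the $x_{k+1}$ term.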

Highlights

  • When monitoring and/or controlling real-world systems, relevant states often cannot be measured for a given application, owing to difficult access when installing sensors or because some state variables have no physical representation

  • State observers present a solution to this problem, enabling devices consisting of sensors, micro-controllers, and embedded algorithms to synthesize state-space observers based on adaptive dynamic programming (ADP) approaches [12]–[14]

  • In this paper, a state reconstruction method for dynamical systems based on dynamic programming and reinforcement learning approaches, driven by measured data, is presented

Summary

INTRODUCTION

When monitoring and/or controlling real-world systems, relevant states often cannot be measured for a given application, owing to difficult access when installing sensors or because some state variables have no physical representation. Since not all states can be measured for full state feedback, state observer devices enable the application of optimal control methodologies. State observers present a solution to this problem, enabling devices consisting of sensors, micro-controllers, and embedded algorithms to synthesize state-space observers based on adaptive dynamic programming (ADP) approaches [12]–[14]. Matrix P still depends on matrices A, B, and C through M0, Mu, and My.

2) TEMPORAL DIFFERENCE ERROR BASED ON MEASURED DATA

The reinforcement learning algorithm based on temporal differences, which determines the value function online, can be defined using the Bellman temporal difference error equation for the DLQR with respect to the states: $e_k = -x_k^{T} P x_k + y_k^{T} Q y_k + u_k^{T} R u_k + x_{k+1}^{T} P x_{k+1}$. The advantage here is that no dynamical system model is needed to estimate the control actions; only the measured data and tuning parameters, such as the forgetting factor, the discount factor, the RLS covariance matrix, and the weighting matrices Q and R, are required.
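
As a minimal sketch of how such a temporal-difference scheme can be driven by measured data alone, the fragment below fits the quadratic value function $V(x_k) = x_k^{T} P x_k$ by recursive least squares with a forgetting factor. The function names (quad_basis, rls_update, estimate_P), the basis parameterization, and the initial covariance value are illustrative assumptions, not the authors' implementation.

  # Sketch only: estimate the DLQR value-function matrix P from measured data by
  # driving the Bellman temporal-difference error
  #   e_k = -x_k' P x_k + y_k' Q y_k + u_k' R u_k + x_{k+1}' P x_{k+1}
  # toward zero with recursive least squares (RLS).
  import numpy as np

  def quad_basis(x):
      """Quadratic basis such that x' P x = theta' phi(x); theta holds the
      diagonal entries of P and twice the off-diagonal entries."""
      n = len(x)
      return np.array([x[i] * x[j] for i in range(n) for j in range(i, n)])

  def rls_update(theta, P_cov, psi, target, forgetting=0.98):
      """One RLS step with a forgetting factor (a tuning parameter)."""
      gain = P_cov @ psi / (forgetting + psi @ P_cov @ psi)
      theta = theta + gain * (target - psi @ theta)
      P_cov = (P_cov - np.outer(gain, psi @ P_cov)) / forgetting
      return theta, P_cov

  def estimate_P(xs, ys, us, Q, R, gamma=1.0, forgetting=0.98):
      """Fit theta (i.e., P) from measured trajectories xs, ys, us."""
      m = xs.shape[1] * (xs.shape[1] + 1) // 2
      theta = np.zeros(m)
      P_cov = 1e3 * np.eye(m)          # RLS covariance matrix (tuning parameter)
      for k in range(len(us)):
          psi = quad_basis(xs[k]) - gamma * quad_basis(xs[k + 1])
          target = ys[k] @ Q @ ys[k] + us[k] @ R @ us[k]   # one-step utility
          theta, P_cov = rls_update(theta, P_cov, psi, target, forgetting)
      return theta                     # packed entries of P (see quad_basis)

For example, given recorded trajectories xs (N+1 samples), ys, and us (N samples each), estimate_P(xs, ys, us, Q, R) returns the packed entries of P, with the forgetting factor, discount factor, and RLS covariance acting as the tuning parameters listed above.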

MATRIX APPROXIMATION AND STATE
SIMULATION AND ANALYSIS
CONCLUSION