Abstract

In this paper, a novel Virtual State-Feedback Reference Tuning (VSFRT) approach and Approximate Iterative Value Iteration Reinforcement Learning (AI-VIRL) are applied to learn linear reference model output (LRMO) tracking control of observable systems with unknown dynamics. For the observable system, a new state representation in terms of input/output (IO) data is derived. Consequently, the Virtual Reference Feedback Tuning (VRFT)-based solution is redefined to accommodate virtual state-feedback control, leading to an original stability-certified Virtual State-Feedback Reference Tuning (VSFRT) concept. Both VSFRT and AI-VIRL use neural network controllers. We find that AI-VIRL is significantly more computationally demanding and more sensitive to the exploration settings, while leading to inferior LRMO tracking performance when compared to VSFRT. Nor is it helped by using transfer learning of the VSFRT control as an initialization for AI-VIRL. State dimensionality reduction using machine learning techniques such as principal component analysis and autoencoders does not improve on the best learned tracking performance; however, it trades off learning complexity. Surprisingly, unlike AI-VIRL, the VSFRT control is one-shot (non-iterative) and learns stabilizing controllers even in poorly explored open-loop environments, proving superior for learning LRMO tracking control. Validation on two nonlinear, coupled, multivariable complex systems serves as a comprehensive case study.
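
The virtual state representation mentioned above is built purely from measured IO samples. As an illustration only, the following minimal Python sketch shows one common such construction for an observable SISO system, where the state at step k stacks the last n outputs and the last n-1 inputs; the horizon n, the function name and the example data are assumptions made for this sketch, not the paper's exact derivation.

    import numpy as np

    def virtual_state(y_hist, u_hist, n=3):
        """Illustrative virtual state for an observable SISO system:
        the last n measured outputs and the last n-1 applied inputs,
        stacked into one regressor vector (horizon n is an assumption).
        y_hist, u_hist are 1-D arrays ordered oldest -> newest."""
        y_part = y_hist[-n:][::-1]        # y_k, y_{k-1}, ..., y_{k-n+1}
        u_part = u_hist[-(n - 1):][::-1]  # u_{k-1}, ..., u_{k-n+1}
        return np.concatenate([y_part, u_part])

    # Example: build the virtual state "seen" at step k = 3 of a recorded trajectory.
    y = np.array([0.0, 0.1, 0.25, 0.38, 0.47])
    u = np.array([1.0, 1.0, 0.8, 0.6, 0.5])
    x_virt = virtual_state(y[:4], u[:3], n=3)
    print(x_virt)   # -> [0.38 0.25 0.1  0.8  1.0]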

Highlights

  • Learning control from input/output (IO) system data is a significant current research area

  • One contribution of this work is to propose, for the first time, such a model-free framework, called Virtual State-Feedback Reference Tuning (VSFRT), which learns control based on the feedback provided by the virtual state representation (a simplified sketch of this learning step follows this list)

  • Since the virtual state-feedback control design is attempted based on the VSFRT principle, it is known from classical Virtual Reference Feedback Tuning (VRFT) control that the non-minimum-phase (NMP) property of (1) requires special care; for simplification, it will be assumed that (1) is minimum-phase
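
To make the VSFRT idea in the highlights concrete, the following is a minimal, illustrative Python sketch and not the paper's exact algorithm: open-loop IO data are recorded, a virtual reference is obtained by inverting an assumed first-order reference model on the measured output, and a neural-network controller is then fitted in one shot so that, when fed the virtual state and the virtual tracking error, it reproduces the recorded input. The toy plant, the reference-model pole a, the state horizon and the MLPRegressor settings are all assumptions made for this example.

    import numpy as np
    from sklearn.neural_network import MLPRegressor

    rng = np.random.default_rng(0)

    # 1) Record open-loop IO data with an exploratory input (toy plant, stand-in only).
    N = 400
    u = rng.uniform(-1.0, 1.0, N)
    y = np.zeros(N)
    for k in range(N - 1):
        y[k + 1] = 0.8 * y[k] + 0.3 * np.tanh(u[k])   # dynamics unknown to the designer

    # 2) Virtual reference: invert the assumed model y_m[k+1] = a*y_m[k] + (1-a)*r[k].
    a = 0.9
    r_virt = (y[1:] - a * y[:-1]) / (1.0 - a)          # r_virt[k] plays the role of r~_k

    # 3) Virtual states (past IO samples) and regression targets (recorded inputs).
    X, U = [], []
    for k in range(2, N - 1):
        state = np.r_[y[k], y[k - 1], u[k - 1]]        # illustrative virtual state
        e_virt = r_virt[k] - y[k]                      # virtual tracking error
        X.append(np.r_[state, e_virt])
        U.append(u[k])                                 # controller must reproduce u_k

    # 4) One-shot fit of the NN controller: (virtual state, error) -> control input.
    controller = MLPRegressor(hidden_layer_sizes=(20, 20), max_iter=2000, random_state=0)
    controller.fit(np.array(X), np.array(U))

Because the fit is a single supervised regression on already-collected data, no iteration over the plant is needed, which is the one-shot property emphasized in the abstract.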


Summary

Introduction

Learning control from input/output (IO) system data is a significant current research area. The class of offline, off-policy VIRL for unknown dynamical systems is adopted, based on neural network (NN) function approximators; it will be coined Approximate Iterative VIRL (AI-VIRL). For this model-free, offline, off-policy learning variant, a database of transition samples (or experiences) is required to learn the optimal control. This is different from the virtual environments specific, e.g., to video games [14] (or even simulated mechatronic systems), where instability leads to an episode (or simulation) termination but physical damage is not a threat. Another issue with reinforcement learning algorithms such as AI-VIRL is the state representation. One contribution of this work is to propose, for the first time, such a model-free framework, called Virtual State-Feedback Reference Tuning (VSFRT), which learns control based on the feedback provided by the virtual state representation.
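
For contrast with the one-shot VSFRT fit, the sketch below illustrates a generic offline, off-policy approximate value iteration (fitted-Q style) over a fixed database of transitions, in the spirit of the AI-VIRL variant described above but not identical to it; the stage cost, discount factor, action grid and network sizes are placeholder assumptions for this example.

    import numpy as np
    from sklearn.neural_network import MLPRegressor

    def stage_cost(x, u, x_ref=0.0):
        # Quadratic output-tracking cost (placeholder).
        return (x[:, 0] - x_ref) ** 2 + 0.1 * u ** 2

    def fitted_value_iteration(X, U, X_next, n_iters=20, gamma=0.95):
        """Offline approximate value iteration on a fixed transition database
        (X, U, X_next); each sweep refits a Q-network against bootstrapped targets."""
        actions = np.linspace(-1.0, 1.0, 11)          # coarse action grid (assumption)
        q = None
        for _ in range(n_iters):
            if q is None:
                targets = stage_cost(X, U)            # first sweep: zero value function
            else:
                # V(x') = min over candidate actions of the current Q(x', u')
                q_next = np.stack([
                    q.predict(np.c_[X_next, np.full(len(X_next), a)]) for a in actions
                ], axis=1)
                targets = stage_cost(X, U) + gamma * q_next.min(axis=1)
            q = MLPRegressor(hidden_layer_sizes=(40, 40), max_iter=1000, random_state=0)
            q.fit(np.c_[X, U], targets)               # off-policy regression on the database
        return q, actions

    def greedy_control(q, actions, x):
        # Greedy (policy-improvement) action at state x from the learned Q-network.
        scores = [q.predict(np.r_[x, a].reshape(1, -1))[0] for a in actions]
        return actions[int(np.argmin(scores))]

    # Usage with a placeholder transition database (replace with real recorded data).
    X = np.random.randn(500, 3)
    U = np.random.uniform(-1, 1, 500)
    X_next = X + 0.1 * np.random.randn(500, 3)
    q, actions = fitted_value_iteration(X, U, X_next, n_iters=5)

Each sweep refits the approximator from scratch on bootstrapped targets, which is why such iterative schemes are more computationally demanding and more exposed to poor exploration of the transition database than a single supervised VSFRT regression.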

The LRMO Tracking Problem
Recapitulating VRFT for Error-Feedback IO Control
A VSFRT controller rendering VVR
There exists a set of nonlinear parameterized state-feedback continuously differentiable …
The AI-VIRL Solution for the LRM Output Tracking
The Neural Transfer Learning Capacity
First Validation Case Study
IO Data Collected in Closed-Loop
Learning
Learning the AI-VIRL
AI-VIRL
Open-loop
VSFRT and AI-VIRL tracking when learning uses the open-loop space
Testing the Transfer Learning Advantage
Second Validation Case Study
IO Data Collection in Closed-Loop
Learning with the collected data
Each learning trains for MaxSteps on …
Findings
Open-loop
