Abstract

A novel decentralized reinforcement learning robust optimal tracking control scheme for time-varying constrained reconfigurable modular robots, based on an actor-critic-identifier (ACI) structure and a state-action value function (Q-function), is presented to solve the continuous-time nonlinear optimal control problem for this strongly coupled, uncertain robotic system. The dynamics of the time-varying constrained reconfigurable modular robot are described as a synthesis of interconnected subsystems, and a continuous-time state equation and Q-function are designed in this paper. Combining the ACI structure with RBF networks, the global uncertainty of each subsystem and the Hamilton-Jacobi-Bellman (HJB) equation are estimated: the critic NN approximates the optimal Q-function, the actor NN approximates the optimal control policy, and the identifier, itself an RBF NN, identifies the global uncertainty and is used to update the ACI-NN weights. On this basis, a novel decentralized robust optimal tracking controller is proposed for each subsystem, so that the subsystem tracks the desired trajectory and the tracking error converges to zero in finite time. The stability of the ACI and of the robust optimal tracking controller is confirmed by Lyapunov theory. Finally, comparative simulation examples illustrate the effectiveness of the proposed ACI and decentralized control scheme.
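The actor-critic interplay described above can be illustrated with a minimal sketch: a critic NN with RBF features approximates the Q-function Q(x, u), and an actor NN with RBF features approximates the control policy. The plant, cost, feature layout, and gradient-descent update laws below are all assumptions for illustration, not the paper's exact update laws.

```python
import numpy as np

# Sketch of an actor-critic with RBF features (assumed setup, not the
# paper's design): critic approximates Q(x, u), actor approximates u = pi(x).

def rbf(z, centers, width=0.5):
    """Gaussian RBF feature vector for a scalar input z."""
    return np.exp(-((z - centers) ** 2) / (2.0 * width ** 2))

def features(x, u, xc, uc):
    """Separable RBF features over the (x, u) pair for the critic."""
    return np.outer(rbf(x, xc), rbf(u, uc)).ravel()

xc = np.linspace(-2.0, 2.0, 5)      # RBF centers in x (critic and actor)
uc = np.linspace(-2.0, 2.0, 5)      # RBF centers in u (critic)
Wc = np.zeros(xc.size * uc.size)    # critic weights (Q-function)
Wa = np.zeros(xc.size)              # actor weights (policy)

dt, gamma = 0.05, 0.95
lr_c, lr_a = 0.02, 0.005
x = 1.5                             # initial state
x0 = x

for _ in range(2000):
    u = float(Wa @ rbf(x, xc))                     # actor: u = Wa^T phi(x)
    x_next = x + dt * (-x ** 3 + u)                # assumed scalar plant
    r = dt * (x ** 2 + 0.1 * u ** 2)               # quadratic stage cost
    u_next = float(Wa @ rbf(x_next, xc))
    # Temporal-difference error on the Q-function (cost-to-go form).
    delta = r + gamma * Wc @ features(x_next, u_next, xc, uc) \
              - Wc @ features(x, u, xc, uc)
    Wc += lr_c * delta * features(x, u, xc, uc)    # critic update
    # Actor update: descend dQ/du through the policy parameters.
    dphi_du = np.outer(rbf(x, xc),
                       rbf(u, uc) * (uc - u) / 0.25).ravel()
    Wa -= lr_a * float(Wc @ dphi_du) * rbf(x, xc)
    x = x_next
```

The identifier component of the ACI, which estimates the unknown dynamics online, is omitted here; this sketch only shows how the critic's TD error drives both weight vectors.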

Highlights

  • A reconfigurable modular robot can transform its configuration according to external conditions and task requirements

  • A fuzzy logic system is used to approximate the unknown dynamics of each subsystem, and a sliding mode controller with an adaptive scheme is designed to compensate for both the interconnection term and the fuzzy approximation error

  • We present a novel continuous-time decentralized reinforcement learning robust optimal tracking control scheme for time-varying constrained reconfigurable modular robots


Summary

Introduction

A reconfigurable modular robot can transform its configuration according to external conditions and task requirements. Because of its adaptive optimization capability for nonlinear models under uncertainty, reinforcement learning offers a unique advantage in solving optimization and control problems for complex models [9,10,11]. Zhang and his team presented an infinite-horizon optimal tracking control scheme for discrete-time nonlinear systems via the greedy HDP iteration algorithm [12,13,14]. A data-driven robust approximate optimal tracking control was proposed using adaptive dynamic programming, with a data-driven model established by a recurrent neural network to reconstruct the unknown system dynamics from available input-output data [15]. A fuzzy critic estimator was designed to estimate the value function for nonlinear continuous-time systems [16]. The proposed control method compensates for the impacts of model uncertainties and interconnection terms on the system, so that the subsystems track the desired trajectories and the tracking errors converge to zero in finite time.
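For intuition, in the linear-quadratic special case the greedy HDP iteration mentioned above reduces to a value-function recursion that converges to the solution of the discrete-time algebraic Riccati equation (DARE). A minimal sketch follows; the system matrices are arbitrary illustrative choices, not taken from the cited works.

```python
import numpy as np

# Greedy HDP-style value iteration in the linear-quadratic case:
# each sweep does a greedy policy improvement, then a value update
# under that policy. The fixed point is the DARE solution.
# System and cost matrices below are illustrative assumptions.

A = np.array([[1.0, 0.1],
              [0.0, 1.0]])          # discretized double integrator
B = np.array([[0.0],
              [0.1]])
Q = np.eye(2)                        # state cost
R = np.array([[1.0]])                # control cost

P = Q.copy()                         # value-function weights: V(x) = x^T P x
for _ in range(500):
    # Greedy policy improvement: u = -K x minimizes the one-step Q-value.
    K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    # Value update under the greedy policy.
    P = Q + K.T @ R @ K + (A - B @ K).T @ P @ (A - B @ K)

# DARE residual: zero at the fixed point of the iteration.
residual = Q + A.T @ P @ A - P \
    - A.T @ P @ B @ np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
```

In the nonlinear setting of the cited HDP schemes, the quadratic form x^T P x is replaced by a critic neural network, and the policy-improvement step by an actor network, but the alternating improve/update structure is the same.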

Problem Formulation
Simulations
Conclusions and Future Work
