Model-Free Distributed Reinforcement Learning State Estimation of a Dynamical System Using Integral Value Functions

Babak Salamat,Lenz Belzner,Andrea M Tonello,Gerhard Elsbacher

doi:10.1109/ojcsys.2023.3250089

Babak Salamat, Lenz Belzner + Show 2 more

Open Access

https://doi.org/10.1109/ojcsys.2023.3250089

Copy DOI

Abstract

One of the challenging problems in sensor network systems is to estimate and track the state of a target point mass with unknown dynamics. Recent improvements in deep learning (DL) show a renewed interest in applying DL techniques to state estimation problems. However, the process noise is absent which seems to indicate that the point-mass target must be non-maneuvering, as process noise is typically as significant as the measurement noise for tracking maneuvering targets. In this paper, we propose a continuous-time (CT) model-free or model-building distributed reinforcement learning estimator (DRLE) using an integral value function in sensor networks. The DRLE algorithm is capable of learning an optimal policy from a neural value function that aims to provide the estimation of a target point mass. The proposed estimator consists of two high pass consensus filters in terms of weighted measurements and inverse-covariance matrices and a critic reinforcement learning mechanism for each node in the network. The efficiency of the proposed DRLE is shown by a simulation experiment of a network of underactuated vertical takeoff and landing aircraft with strong input coupling. The experiment highlights two advantages of DRLE: i) it does not require the dynamic model to be known, and ii) it is an order of magnitude faster than the state-dependent Riccati equation (SDRE) baseline.

Full Text