This paper develops approximation and optimality results for the optimal control of a networked system in which a Markovian process is controlled over a finite-rate, noiseless communication channel. Solving this problem requires determining jointly optimal coding and control policies that minimize cost over time. While theoretical results on the structure of optimal coding and control policies exist, practical implementation has remained largely infeasible for non-linear systems due to computational complexity and uncountable state spaces. This research introduces an approach that combines these structural results with reinforcement learning (RL) techniques. The method uses regularity properties of the system to approximate the uncountable state space with a countable one, which allows RL algorithms to reach near-optimal solutions. Specifically, we establish that finite model approximations (where infinite state spaces are quantized to finite ones) and sliding finite window approximations (where a finite-memory "window" of past control actions is maintained at each time step) can be employed to obtain near-optimality. These approximations allow the system to be reformulated as a Markov decision process (MDP) with a finite state space, making RL algorithms implementable. The convergence of the reinforcement learning algorithm to a near-optimal policy, under both approximations, is supported by theoretical analysis and performance simulations. The resulting solutions are thus not only computationally feasible but also nearly optimal with respect to the original problem. This work applies to a broad class of networked control systems, in particular those involving zero-delay coding and partially observable Markov decision processes (POMDPs). By integrating structural results with learning algorithms, this paper provides a practical framework for implementing near-optimal control in finite-rate environments.
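As a rough illustration of the finite-model step described above, the sketch below quantizes a continuous scalar state into a finite set of bins and runs standard tabular Q-learning on the resulting finite MDP. The dynamics, stage cost, quantizer resolution, and action set are illustrative assumptions chosen for brevity, not the paper's joint coding/control formulation.

```python
import numpy as np

# Minimal sketch: quantize a continuous scalar state space into bins, then run
# standard tabular Q-learning on the resulting finite MDP. The dynamics, cost,
# and quantizer below are illustrative placeholders, not the paper's setup.

rng = np.random.default_rng(0)

# Hypothetical scalar non-linear dynamics and quadratic stage cost (assumed).
def step(x, u):
    x_next = 0.9 * np.sin(x) + 0.5 * u + 0.1 * rng.standard_normal()
    cost = x**2 + 0.1 * u**2
    return np.clip(x_next, -3.0, 3.0), cost

# Uniform quantizer: map the truncated interval [-3, 3] to bin indices.
n_bins = 51
edges = np.linspace(-3.0, 3.0, n_bins + 1)
def quantize(x):
    return int(np.clip(np.digitize(x, edges) - 1, 0, n_bins - 1))

# Finite action set (stands in for a finite alphabet of control actions).
actions = np.linspace(-1.0, 1.0, 5)

# Tabular Q-learning on the quantized (finite) model with discounted cost.
Q = np.zeros((n_bins, len(actions)))
gamma, alpha, eps = 0.95, 0.1, 0.1
x = 0.0
for t in range(200_000):
    s = quantize(x)
    # Epsilon-greedy exploration; greedy means minimizing the Q-value (cost).
    a = rng.integers(len(actions)) if rng.random() < eps else int(np.argmin(Q[s]))
    x_next, cost = step(x, actions[a])
    s_next = quantize(x_next)
    # Bellman update for cost minimization: target uses min over next actions.
    Q[s, a] += alpha * (cost + gamma * Q[s_next].min() - Q[s, a])
    x = x_next

# Greedy policy on the quantized states after learning.
greedy_policy = actions[np.argmin(Q, axis=1)]
print(greedy_policy[:10])
```

The design choice mirrored here is only the high-level one stated in the abstract: once the state space is quantized to a finite set, conventional finite-state RL machinery becomes applicable.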