Abstract

We consider a distributed learning framework in which a group of agents communicates with a centralized coordinator. The goal of the agents is to find the root of an operator composed of the local operators at the agents. Such a framework models many practical problems in different areas, including federated learning and reinforcement learning. For solving this problem, we study the popular distributed (local) stochastic approximation. Over a series of time epochs, each agent runs a number of local stochastic approximation steps based on its own data, and the resulting iterates are then aggregated at the centralized coordinator. Existing theoretical guarantees on the finite-time performance of local stochastic approximation rely on the common assumption that the local data at each agent are sampled i.i.d. This assumption may not hold in many applications where the data are temporally dependent, for example, when they are sampled from a dynamical system. In this paper, we study the setting where the data are generated by Markov random processes, which are often used to model systems in stochastic control and reinforcement learning. Our main contribution is to characterize the finite-time performance of local stochastic approximation under this setting. We provide explicit convergence rates for this method under both constant and time-varying step sizes when the local operators are strongly monotone. Our results show that these rates are within a logarithmic factor of the comparable bounds under independent data. We also provide a number of numerical simulations to illustrate our theoretical results by applying local SA to problems in robust identification and reinforcement learning over multi-agent systems.
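To make the described scheme concrete, the following is a minimal sketch of local stochastic approximation with Markovian sampling; it is not the paper's code, and all names, dimensions, and the linear local operators G_i(x, s) = A_i[s] x - b_i[s] are illustrative assumptions chosen to satisfy strong monotonicity.

```python
import numpy as np

# Hypothetical illustration of local stochastic approximation (local SA).
# Each agent seeks a root of its local operator; here the local operator is
# G_i(x, s) = A_i[s] @ x - b_i[s], and the sample index s evolves along a
# Markov chain rather than being drawn i.i.d.

rng = np.random.default_rng(0)

N_AGENTS, DIM, N_STATES = 4, 3, 5   # assumed problem sizes
K_LOCAL, ROUNDS = 10, 200           # local SA steps per round, communication rounds

# Per-agent data: for each Markov state, a matrix and a vector.
# Symmetrizing and adding a multiple of the identity keeps A strongly monotone.
A = rng.uniform(0.5, 1.5, (N_AGENTS, N_STATES, DIM, DIM))
A = A + np.transpose(A, (0, 1, 3, 2)) + 2 * DIM * np.eye(DIM)
b = rng.normal(size=(N_AGENTS, N_STATES, DIM))

# A simple ergodic Markov chain over sample indices (shared transition matrix).
P = rng.uniform(size=(N_STATES, N_STATES))
P /= P.sum(axis=1, keepdims=True)

def local_sa_round(x, states, step):
    """Each agent runs K_LOCAL SA steps on its Markovian samples; the coordinator averages."""
    iterates = np.tile(x, (N_AGENTS, 1))
    for _ in range(K_LOCAL):
        for i in range(N_AGENTS):
            s = states[i]
            g = A[i, s] @ iterates[i] - b[i, s]       # local operator at the current sample
            iterates[i] = iterates[i] - step * g      # stochastic approximation update
            states[i] = rng.choice(N_STATES, p=P[s])  # advance the Markov chain
    return iterates.mean(axis=0), states              # aggregation at the coordinator

x = np.zeros(DIM)
states = rng.integers(N_STATES, size=N_AGENTS)
for t in range(ROUNDS):
    step = 1.0 / (t + 10)   # a time-varying step size, in the spirit of the analysis
    x, states = local_sa_round(x, states, step)
print("final iterate:", x)
```

A constant step size can be substituted for the decaying one above; the paper's bounds cover both regimes.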
