Optimal Routing for Stream Learning Systems

Xinzhe Fu,Eytan Modiano

doi:10.1109/infocom48880.2022.9796959

Xinzhe Fu, Eytan Modiano

https://doi.org/10.1109/infocom48880.2022.9796959

Copy DOI

Export

Save

Cite

Abstract
Full-Text
Similar Papers

Abstract

Listen

Consider a stream learning system with a source and a set of computation nodes that solves a machine learning task modeled as stochastic convex optimization problem over an unknown distribution D. The source generates i.i.d. data points from D and routes the data points to the computation nodes for processing. The data points are processed in a streaming fashion, i.e., each data point can be accessed only once and is discarded after processing. The system employs local stochastic gradient descent (local SGD), where each computation node performs stochastic gradient descent locally using the data it receives from the source and periodically synchronizes with other computation nodes. Since the routing policy of the source determines the availability of data points at each computation node, the performance of the system, i.e., the optimization error obtained by local SGD, depends on the routing policy.In this paper, we study the influence of the routing policy on the performance of stream learning systems. We first derive an upper bound on the optimization error as a function of the routing policy. The upper bound reveals that the routing policy influences the performance through tuning the bias-variance trade-off of the optimization process, and gives rise to a framework for optimizing the routing policy for stream learning systems. By minimizing the upper bound, we propose an optimal static routing policy that achieves the best trade-off for stream learning systems with deterministic data generation process. We then propose a routing policy that can approximate the optimal static routing policy arbitrarily closely for systems where the data points are generated according to a stochastic process with unknown rate. Finally, we conduct simulations using Support Vector Machine as the machine learning task on a real data set, and show that the optimal static routing policy has excellent empirical performance in terms of minimizing the optimization error and the proposed stochastic routing policy closely matches the optimal static routing policy.

Full Text