Distributed Linearly Separable Computation

Kai Wan, Hua Sun, Giuseppe Caire, Mingyue Ji

doi:10.1109/tit.2021.3127910

Abstract

This paper formulates a distributed computation problem, where a master asks ${\mathsf N}$ distributed workers to compute a linearly separable function. The task function can be expressed as ${\mathsf K}_{\mathrm{c}}$ linear combinations of ${\mathsf K}$ messages, where each message is a function of one dataset. Our objective is to find the optimal tradeoff between the computation cost (number of uncoded datasets assigned to each worker) and the communication cost (number of symbols the master must download), such that from the answers of any ${\mathsf N}_{\mathrm{r}}$ out of ${\mathsf N}$ workers the master can recover the task function with high probability, where the coefficients of the ${\mathsf K}_{\mathrm{c}}$ linear combinations are uniformly i.i.d. over some large enough finite field. The formulated problem can be seen as a generalized version of some existing problems, such as distributed gradient coding and distributed linear transform. In this paper, we consider the specific case where the computation cost is minimum, and propose novel achievability schemes and converse bounds for the optimal communication cost. Achievability and converse bounds coincide for some system parameters; when they do not match, we prove that the achievable distributed computing scheme is optimal under the constraint of a widely used ‘cyclic assignment’ scheme on the datasets.
Our results also show that when ${\mathsf K}= {\mathsf N}$, with the same communication cost as the optimal distributed gradient coding scheme proposed by Tandon et al. from which the master recovers one linear combination of ${\mathsf K}$ messages, our proposed scheme can let the master recover any additional ${\mathsf N}_{\mathrm{r}}-1$ independent linear combinations of messages with high probability.
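
To make the ‘cyclic assignment’ mentioned in the abstract concrete, the following is a minimal sketch for the case ${\mathsf K}={\mathsf N}$. It assumes that each worker holds ${\mathsf N}-{\mathsf N}_{\mathrm{r}}+1$ consecutive datasets (indices taken modulo ${\mathsf N}$); this per-worker load is an assumption of the sketch, not a figure stated in the abstract. The function names `cyclic_assignment` and `every_subset_covers` are hypothetical and chosen only for illustration. The check covers only a necessary condition (every dataset is held by at least one responding worker); it says nothing about the coded answers or the communication cost.

```python
from itertools import combinations


def cyclic_assignment(n_workers, n_recover):
    """Sketch for K = N: worker n holds datasets n, n+1, ..., n+(N-N_r), mod N.

    The load N - N_r + 1 per worker is an assumption of this sketch.
    """
    per_worker = n_workers - n_recover + 1
    return [
        {(n + j) % n_workers for j in range(per_worker)}
        for n in range(n_workers)
    ]


def every_subset_covers(assignment, subset_size):
    """Return True if every group of `subset_size` workers jointly holds all datasets."""
    all_datasets = set(range(len(assignment)))
    return all(
        set().union(*(assignment[n] for n in subset)) == all_datasets
        for subset in combinations(range(len(assignment)), subset_size)
    )


# Illustrative parameters (assumed, not from the paper): N = 6 workers, N_r = 4.
N, N_R = 6, 4
assignment = cyclic_assignment(N, N_R)
covered_with_nr = every_subset_covers(assignment, N_R)
covered_with_fewer = every_subset_covers(assignment, N_R - 1)
```

With these parameters each dataset is stored at three workers, so any four responding workers jointly hold every dataset, while some groups of three do not. Coverage of this kind is needed before any choice of coded answers can let the master recover the task function.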

Full Text