Incentive Mechanism Design for Distributed Coded Machine Learning

Ningning Ding, Lingjie Duan, Zhixuan Fang, Jianwei Huang

doi:10.1109/infocom42981.2021.9488672

Abstract

A distributed machine learning platform needs to recruit many heterogeneous worker nodes to finish computation simultaneously. As a result, the overall performance may be degraded by straggling workers. By introducing redundancy into computation, coded machine learning can effectively improve the runtime performance by recovering the final computation result from the first k (out of the total n) workers who finish computation. While existing studies focus on designing efficient coding schemes, the issue of designing proper incentives to encourage worker participation is still under-explored. This paper studies the platform's optimal incentive mechanism for motivating proper workers' participation in coded machine learning, despite incomplete information about heterogeneous workers' computation performance and costs. A key contribution of this work is to summarize workers' multi-dimensional heterogeneity as a one-dimensional metric, which guides the platform's efficient selection of workers under incomplete information with linear computational complexity. Moreover, we prove that the optimal recovery threshold k is linearly proportional to the number of participating workers n if we use the widely adopted MDS codes for data encoding. We also show that the platform's increased cost due to incomplete information disappears when the number of workers is sufficiently large, although it does not decrease monotonically in the number of workers.
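To make the recovery threshold concrete, the following minimal Python sketch treats the coded job as finished once the k-th fastest of n workers reports back. The function name `coded_runtime` and the sample finishing times are illustrative assumptions, not taken from the paper, and the sketch ignores the fact that under MDS codes each worker's load (and hence its finishing time) also depends on k.

```python
import heapq


def coded_runtime(finish_times, k):
    """Return the time at which the k-th fastest worker finishes.

    Hypothetical helper: models the job as complete once any k of the
    n workers have returned their results.
    """
    n = len(finish_times)
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    # The k-th smallest finishing time is the last of the k smallest values.
    return heapq.nsmallest(k, finish_times)[-1]


# Assumed finishing times (arbitrary units) for n = 5 workers; one straggler.
finish_times = [2.0, 9.5, 3.1, 2.7, 4.0]

wait_for_all = coded_runtime(finish_times, len(finish_times))  # waits for the straggler
wait_for_three = coded_runtime(finish_times, 3)                # ignores the two slowest
```

With these assumed values, waiting for all five workers is dominated by the single straggler, whereas waiting for only three is not; this is the runtime benefit that the redundancy buys.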

Full Text