Because of the powerful computing and storage capability in cloud computing, machine learning as a service (MLaaS) has recently been valued by the organizations for machine learning training over some related representative datasets. When these datasets are collected from different organizations and have different distributions, multi-task learning (MTL) is usually used to improve the generalization performance by scheduling the related training tasks into the virtual machines in MLaaS and transferring the related knowledge between those tasks. However, because of concerns about privacy breaches (e.g., property inference attack and model inverse attack), organizations cannot directly outsource their training data to MLaaS or share their extracted knowledge in plaintext, especially the organizations in sensitive domains. In this article, we propose a novel privacy-preserving mechanism for distributed MTL, namely NOInfer, to allow several task nodes to train the model locally and transfer their shared knowledge privately. Specifically, we construct a single-server architecture to achieve the private MTL, which protects task nodes’ local data even if n-1 out of n nodes colluded. Then, a new protocol for the Alternating Direction Method of Multipliers (ADMM) is designed to perform the privacy-preserving model training, which resists the inference attack through the intermediate results and ensures that the training efficiency is independent of the number of training samples. When releasing the trained model, we also design a differentially private model releasing mechanism to resist the membership inference attack. Furthermore, we analyze the privacy preservation and efficiency of NOInfer in theory. Finally, we evaluate our NOInfer over two testing datasets and evaluation results demonstrate that NOInfer efficiently and effectively achieves the distributed MTL.
Read full abstract