Vertical federated learning (VFL) enables collaborative machine learning on vertically partitioned data while preserving privacy. However, VFL methods face three daunting challenges in real-world applications. First, most existing VFL methods assume that at least one party holds the complete set of labels for all data samples. This assumption is often violated in practical scenarios, where each party holds only partial labels. Second, the heterogeneity and dynamics of computational and communication resources across participating parties may cause the straggler problem and slow down training convergence. Third, confidential label information could be exposed to malicious parties during VFL. To address these challenges, we propose a novel VFL algorithm named Cascade Vertical Federated Learning (CVFL), in which partitioned labels are fully utilized to train neural networks while preserving privacy. To mitigate the straggler problem, we design a novel optimization objective that increases the stragglers' contribution to the trained models. To mitigate label privacy risks, we design a novel defense approach that protects label privacy in CVFL. We conduct comprehensive experiments, and the results demonstrate the effectiveness and efficiency of CVFL. Further, the proposed defense approach achieves a better tradeoff between label privacy and model utility than two widely used defense approaches.
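For readers unfamiliar with the setting, the following is a minimal, generic sketch of vertically partitioned data and a split-style forward pass; it illustrates the general VFL setup only, not the CVFL algorithm itself, and all names, shapes, and the linear "models" are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dataset: 6 aligned samples with 5 features, vertically
# partitioned between two parties. Each party holds a disjoint
# feature subset for the *same* sample IDs (hypothetical split).
X = rng.normal(size=(6, 5))
party_A_features = X[:, :3]   # party A holds features 0-2
party_B_features = X[:, 3:]   # party B holds features 3-4

# Each party computes a local embedding with its own model
# (a random linear map stands in for a neural network here).
W_A = rng.normal(size=(3, 2))
W_B = rng.normal(size=(2, 2))
h_A = party_A_features @ W_A
h_B = party_B_features @ W_B

# A label-holding side aggregates the embeddings to produce
# predictions; raw features never leave their owners, only the
# intermediate embeddings are exchanged.
logits = h_A + h_B
print(logits.shape)  # (6, 2)
```

In this generic setup, only the embeddings `h_A` and `h_B` cross party boundaries, which is the privacy boundary that label-inference attacks (and defenses such as the one proposed for CVFL) operate around.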