Towards cross-silo federated learning for corporate organizations

Saikishore Kalloori, Abhishek Srivastava

doi:10.1016/j.knosys.2024.111501

Abstract

Digital media companies rely on machine learning models to target content to their audience’s interests. The performance of such models usually depends on the amount and quality of training data. Although data is abundant today, it is typically stored in data silos and cannot be shared between companies or publishers because of data protection and user privacy constraints. Federated Learning (FL) is a distributed machine learning approach that is rapidly gaining popularity and enables collaborative training of machine learning models on a large corpus of decentralized data. Prior research on FL mainly focuses on setups containing millions of clients, where a client may be, for example, a single user’s mobile device holding that user’s data. However, in many scenarios, corporate organizations such as news media companies, which hold data from multiple sets of users, could also benefit from FL. In this work, we focus on building FL models in which multiple corporate organizations, such as news media companies or banks, participate in the training process to collaboratively train federated models. We used FL to train models for a set of corporate stakeholders on two tasks: a classification task and a ranking task. For the classification task, we designed a tree-based federated random forest algorithm and a neural network-based federated algorithm. For the ranking task, we designed a federated neural ranking model for news article recommendations. Our experimental results demonstrate that, by participating in FL, companies can improve model performance, measured by accuracy on the classification task and by ranking quality on the recommendation task. Furthermore, we designed and developed a simple framework that allows a small number of stakeholders to train federated models.

Full Text