Abstract

Federated learning allows a large number of resource-constrained clients to jointly train a globally shared model without sharing their local data. These clients usually hold only a few classes (categories) of training data, so the data distribution is non-iid (not independent and identically distributed). In this article, we put forward the concept of category privacy for the first time, indicating which classes of data a client holds; this is an important but overlooked privacy goal in federated learning with non-iid data. Although secure aggregation protocols are designed to protect the input privacy of clients in federated learning, we perform the first systematic study of the category inference attack and demonstrate that these protocols cannot fully protect category privacy. We design a differential selection strategy and two de-noising approaches to achieve the attack goal. In our evaluation, we apply the attack to non-iid federated learning settings with various datasets. On the MNIST, CIFAR-10, AG_news, and DBPedia datasets, our attack achieves >90% accuracy measured in F1-score in most cases. We further consider a possible detection method and propose two strategies to make the attack more inconspicuous.
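The abstract does not spell out the attack mechanics, but the general reason category information can leak from model updates is well known: under cross-entropy loss, the gradient of the classifier's output-layer bias for a class tends to be negative when that class appears in the client's batch. The sketch below illustrates only this generic leakage channel on a toy linear classifier; it is not the paper's differential selection strategy or de-noising approach, and all names and shapes in it are illustrative assumptions.

```python
# Hypothetical sketch: why per-class gradients can reveal which classes a
# client holds. This is NOT the paper's attack, only a simplified
# single-update illustration under cross-entropy loss.
import torch
import torch.nn as nn

torch.manual_seed(0)

num_classes, feat_dim, batch = 10, 32, 64
model = nn.Linear(feat_dim, num_classes)   # stand-in for the classification head
criterion = nn.CrossEntropyLoss()

# Simulated non-iid client: it only holds samples from classes {1, 4, 7}.
client_classes = torch.tensor([1, 4, 7])
x = torch.randn(batch, feat_dim)
y = client_classes[torch.randint(len(client_classes), (batch,))]

loss = criterion(model(x), y)
loss.backward()

# For cross-entropy, the bias gradient of class c sums (p_ic - 1[y_i = c])
# over the batch, so classes actually present tend to get negative gradients.
bias_grad = model.bias.grad
inferred = (bias_grad < 0).nonzero(as_tuple=True)[0]
print("true client classes:     ", sorted(client_classes.tolist()))
print("classes inferred present:", inferred.tolist())
```

In a secure-aggregation setting the server only sees the sum of many such updates, which is why, per the abstract, an attacker needs additional machinery (e.g., selective participation and de-noising) to isolate a single client's contribution.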
