Abstract

Federated learning (FL) enables multiple clients to collaboratively build a global model without sharing their raw data, thereby protecting privacy. Unfortunately, recent research has nevertheless found privacy leakage in FL, especially in image classification tasks, such as the reconstruction of class representatives. However, such analyses of image classification tasks do not apply to uncovering privacy threats against natural language processing (NLP) tasks, whose records, composed of sequential texts, cannot be grouped into class representatives. The finer (record-level) granularity of NLP tasks not only makes it more challenging to extract individual text records, but also exposes more serious threats. This article presents the first attempt to explore record-level privacy leakage against NLP tasks in FL. We propose a framework that investigates the exposure of records of interest in federated aggregations by leveraging the perplexity of language modeling. By monitoring these exposure patterns, we devise two correlation attacks that identify the clients from whom specific records are extracted. Extensive experimental results demonstrate the effectiveness of the proposed attacks. We have also examined several countermeasures and shown that they are ineffective in mitigating such attacks; further research is therefore needed.
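The central signal behind such attacks is the perplexity a language model assigns to a candidate record: a record that appeared in a client's local training data tends to receive lower perplexity under model snapshots that incorporate that client's updates. The sketch below illustrates this idea only; it assumes a HuggingFace GPT-2 model as a stand-in for the federated language model, and the `exposure_drop` helper and snapshot-comparison logic are illustrative assumptions, not the paper's exact procedure.

```python
# Illustrative sketch (not the paper's exact method): score a candidate
# record's perplexity under successive global-model snapshots. A sharp
# drop after a round suggests the record influenced that round's updates.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")  # stand-in LM (assumption)

def perplexity(model: GPT2LMHeadModel, text: str) -> float:
    """Perplexity = exp(mean per-token negative log-likelihood)."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        # With labels=input_ids, the model returns the mean cross-entropy
        # over the shifted next-token predictions.
        loss = model(ids, labels=ids).loss
    return torch.exp(loss).item()

def exposure_drop(prev_model, curr_model, record: str) -> float:
    """Hypothetical exposure signal: perplexity decrease across one round."""
    return perplexity(prev_model, record) - perplexity(curr_model, record)

# Usage: load consecutive global-model snapshots and compare. A large
# positive drop for a record of interest hints that the record appeared
# in the local data aggregated in the latest round.
```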
