In recent years, deep learning methods trained on large amounts of data have achieved substantial success in numerous fields. However, as regulations protecting private user data have tightened, access to such data has become restricted. To overcome this limitation, federated learning (FL) has been widely adopted for training deep learning models without centralizing data. However, the inaccessibility of FL data and the heterogeneity of client data make it difficult to secure FL systems and protect privacy, and security and privacy anomalies in these systems significantly hinder the application of FL. Numerous studies have sought to maintain model security and mitigate the leakage of private training data during the FL training phase. Existing surveys categorize FL attacks from a defensive standpoint, which makes it inefficient to pinpoint attack points and deploy timely defenses. In contrast, our survey comprehensively categorizes and summarizes detected anomalies from the client, server, and communication perspectives, facilitating easier identification and timely defense. We first provide an overview of the FL system and briefly introduce FL security and privacy anomalies. Next, we detail existing security and privacy anomalies, together with methods for their detection and defense, from the perspectives of the client, the server, and the communication process. Finally, we address security and privacy anomalies in non-independent and identically distributed (non-IID) cases during FL and summarize the related research progress. This survey aims to provide a systematic and comprehensive review of security and privacy research in FL, to help readers understand the progress made and better apply FL in additional scenarios.
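The FL paradigm discussed above can be sketched with federated averaging (FedAvg): each client trains on its private data locally and sends only model parameters to the server, which aggregates them weighted by local dataset size. The linear model, non-IID data split, and hyperparameters below are illustrative assumptions for the sketch, not details taken from the survey:

```python
import numpy as np

def local_update(weights, X, y, lr=0.1, epochs=5):
    # One client's local gradient descent on a least-squares model
    # (an illustrative stand-in for a deep model); the raw data (X, y)
    # never leaves the client.
    w = weights.copy()
    for _ in range(epochs):
        grad = X.T @ (X @ w - y) / len(X)
        w -= lr * grad
    return w

def fed_avg(global_w, client_datasets):
    # Server-side aggregation: average the clients' updated parameters,
    # weighted by the size of each client's local dataset.
    updates, sizes = [], []
    for X, y in client_datasets:
        updates.append(local_update(global_w, X, y))
        sizes.append(len(X))
    return np.average(updates, axis=0, weights=np.array(sizes, float))

rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])
# Mildly non-IID split: each client samples features from a shifted distribution.
clients = []
for shift in (-1.0, 0.0, 1.0):
    X = rng.normal(shift, 1.0, size=(50, 2))
    y = X @ true_w + rng.normal(0.0, 0.01, size=50)
    clients.append((X, y))

w = np.zeros(2)
for _ in range(20):  # communication rounds
    w = fed_avg(w, clients)
print(np.round(w, 2))
```

Note that only parameter vectors cross the client-server boundary, which is exactly why the attack surfaces the survey catalogs (poisoned updates at the client, inference at the server, interception in transit) arise where they do.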