In Federated Deep Learning (FDL), multiple local enterprises train a model jointly: each submits its local updates to a central server, and the server aggregates the updates to create a global model. However, the resulting models usually perform worse than centrally trained models, especially when the training data are not independent and identically distributed (non-IID), because non-IID data harms the accuracy and performance of the model. In addition, because federated learning (FL) relies on a central server and the participating enterprises cannot be fully trusted, traditional FL solutions are vulnerable to security and privacy attacks. To tackle these issues, we propose FEDANIL, a secure blockchain-enabled Federated Deep LeArning Model that improves the decentralization, performance, and tamper-proof properties of enterprise models. FEDANIL consists of two main phases. The first phase addresses the non-IID challenge (label and feature distribution skew). In this phase, local models with similar data distributions are grouped into homogeneous clusters using the cosine similarity (CS) and affinity propagation (AP) techniques. Then, for each homogeneous cluster, Wasserstein Generative Adversarial Networks (WGAN) are used to deal with label and feature distribution skew. The second phase addresses security and privacy concerns against poisoning and inference attacks in three steps. In the first step, data poisoning attacks are prevented using CS. In the second step, collusion attacks are prevented by randomly selecting enterprises in the consortium blockchain. Finally, in the third step, model poisoning, membership inference, and reconstruction attacks are prevented using the CKKS Fully Homomorphic Encryption (CKKS-FHE) technique and the consortium blockchain. Extensive experiments were conducted on the real-world Sent140, Fashion-MNIST, FEMNIST, and CIFAR-10 datasets to evaluate FEDANIL's robustness and performance.
The simulation results demonstrate that FEDANIL satisfies FDL privacy-preserving requirements. In terms of convergence analysis, the model parameters obtained with FEDANIL converge to the optimal model parameters. In addition, compared with the baseline approaches SHIELDFL, RVPFL, and RFA, FEDANIL performs better in terms of accuracy (more than 11%, 15%, and 24%, respectively) and computation overhead (less than 8%, 10%, and 15%, respectively). The FEDANIL source code is available on GitHub: https://github.com/rezafotohi/FedAnil. For any questions about the code, please contact Fotohi.reza@gmail.com.
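Cosine similarity appears in both phases described above: as the pairwise measure used to group local models, and as the check against data poisoning. The following is a minimal sketch of that measure only, not the FEDANIL implementation. The function names, the flattened-update representation, the `reference` vector, and the `threshold` value of 0.5 are all hypothetical choices made for illustration; the affinity propagation, WGAN, and CKKS-FHE steps are not shown.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention assumed here for zero vectors
    return dot / (norm_u * norm_v)


def similarity_matrix(updates):
    """Pairwise cosine similarities, e.g. as input to a clustering step."""
    return [[cosine_similarity(u, v) for v in updates] for u in updates]


def filter_updates(updates, reference, threshold=0.5):
    """Indices of updates whose similarity to `reference` is at least `threshold`.

    Both `reference` and `threshold` are illustrative assumptions.
    """
    return [
        i for i, u in enumerate(updates)
        if cosine_similarity(u, reference) >= threshold
    ]


# Toy flattened model updates from three hypothetical enterprises.
updates = [
    [1.0, 2.0, 3.0],
    [1.1, 1.9, 3.2],
    [-1.0, -2.0, -3.0],  # points in the opposite direction
]
reference = [1.0, 2.0, 3.0]

sim = similarity_matrix(updates)
kept = filter_updates(updates, reference)
```

In this toy example the third update points in the opposite direction from the reference, so it falls below the threshold and is excluded, while the first two are kept.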