Federated learning consists of a central aggregator and multiple clients, forming a distributed structure that effectively protects data privacy. However, since all participants can access the global model, the risk of model leakage increases, especially when unreliable participants are involved. To safeguard model copyright while enhancing the robustness and secrecy of the watermark, this paper proposes a client-side watermarking scheme. Specifically, the proposed method introduces an additional watermark class, expanding the output layer of the client model into an (N+1)-class classifier. The client's local model is then trained on both the watermark dataset and the local dataset. Notably, before the model is uploaded to the server, the parameters of the watermark class are removed from the output layer and stored locally. In addition, the client uploads amplified parameters to counteract the potential weakening of the watermark during aggregation. After aggregation, the global model is distributed to the clients for further local training. Over multiple rounds of iteration, the stored watermark parameters are continuously updated until the global model converges. On the MNIST, CIFAR-100, and CIFAR-10 datasets, the watermark detection rate reaches 100% for both VGG-16 and ResNet-18. Furthermore, extensive experiments demonstrate that the method has minimal impact on model performance and exhibits strong robustness against pruning and fine-tuning attacks.
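The following is a minimal sketch of the client-side steps described in the abstract (expanding the head to N+1 classes, training on watermark plus local data, then stripping and storing the watermark-class parameters before upload). It assumes a PyTorch model whose classification head is exposed as `model.fc` (as in ResNet-18); the function names, the `amp` scaling factor, and the choice to amplify only the shared head are illustrative assumptions, not details taken from the paper.

```python
import copy
import torch
import torch.nn as nn

def expand_head(model: nn.Module, n_classes: int) -> nn.Linear:
    """Replace the N-class output layer with an (N+1)-class layer;
    the extra logit acts as the dedicated watermark class."""
    old = model.fc                      # assumes the head is `model.fc` (e.g., ResNet-18)
    new = nn.Linear(old.in_features, n_classes + 1)
    with torch.no_grad():
        new.weight[:n_classes] = old.weight
        new.bias[:n_classes] = old.bias
    model.fc = new
    return new

def local_round(model, local_loader, wm_loader, epochs=1, lr=0.01):
    """Train on both the local dataset and the watermark dataset,
    where watermark samples carry the extra class index N."""
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    model.train()
    for _ in range(epochs):
        for loader in (local_loader, wm_loader):
            for x, y in loader:
                opt.zero_grad()
                loss_fn(model(x), y).backward()
                opt.step()

def split_and_amplify(model, n_classes: int, amp: float):
    """Before uploading: keep the watermark-class row and bias locally,
    strip them from the shared head, and scale the uploaded head by `amp`
    so the watermark signal is not washed out by server-side averaging
    (the amplification target is an assumption for illustration)."""
    head = model.fc
    wm_state = {
        "weight": head.weight.data[n_classes].clone(),
        "bias": head.bias.data[n_classes].clone(),
    }
    upload = copy.deepcopy(model.state_dict())
    upload["fc.weight"] = head.weight.data[:n_classes] * amp
    upload["fc.bias"] = head.bias.data[:n_classes] * amp
    return upload, wm_state
```

In each round, the client would re-attach its stored watermark row to the global model it receives, run the next local round, and strip the (now updated) watermark parameters again before upload, consistent with the iterative update of the saved watermark parameters described above.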