Abstract

With the vigorous development of artificial intelligence technology, engineering applications have been deployed one after another. The gradient descent method plays an important role in solving various optimization problems due to its simple structure, good stability, and easy implementation. However, in multinode machine learning systems, gradients usually need to be shared, which can leak privacy, because attackers can infer training data from the gradient information. In this paper, to prevent gradient leakage while preserving model accuracy, we propose the super stochastic gradient descent (SSGD) approach, which updates parameters by concealing the modulus length of each gradient vector and converting it into a unit vector. Furthermore, we analyze the security of the super stochastic gradient descent approach and demonstrate that our algorithm can defend against attacks on the gradient. Experimental results show that our approach is clearly superior to prevalent gradient descent approaches in terms of accuracy, robustness, and adaptability to large-scale batches. Interestingly, our algorithm can also resist model poisoning attacks to a certain extent.
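The abstract describes the core idea of concealing a gradient's modulus length by rescaling it to a unit vector before the update. The paper's exact update rule is given in the full text; the following is only a minimal sketch of that normalization idea, with all names and the learning-rate value chosen for illustration:

```python
import numpy as np

def ssgd_update(params, grad, lr=0.1):
    """Update parameters using only the direction of the gradient.

    The gradient is rescaled to a unit vector, so its modulus length
    (norm) is concealed before it is shared or applied.
    """
    norm = np.linalg.norm(grad)
    if norm == 0.0:
        return params              # zero gradient: nothing to update
    unit_grad = grad / norm        # direction only, length hidden
    return params - lr * unit_grad # step of fixed size lr

# Example: one update step on a 3-dimensional parameter vector.
params = np.array([1.0, 2.0, 3.0])
grad = np.array([0.0, 3.0, 4.0])   # norm 5, so the unit vector is [0, 0.6, 0.8]
new_params = ssgd_update(params, grad, lr=0.5)
```

An observer of the shared vector learns the gradient's direction but not its magnitude, which is the quantity the abstract says SSGD conceals.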

Highlights

  • Gradient descent (GD) is a technique to minimize an objective function, parameterized by a model's parameters, by updating the parameters in the direction opposite to the gradient of the objective function with respect to the parameters [1]

  • Super stochastic gradient descent (SSGD) invalidates these attacks on the gradient, including the attack that searches for an optimal training sample [4, 7, 8] by minimizing the distance between the ground-truth gradient and the gradient computed from a variable, and the attack that solves an equation system [9] to obtain the training data

  • We review some basic gradient descent algorithms [1], including batch gradient descent (BGD), stochastic gradient descent (SGD), and mini-batch gradient descent (MBGD). The difference among them is how much data is used to calculate the gradient of the objective function. Then, we describe the information leakage caused by gradients [19]
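The last highlight distinguishes BGD, SGD, and MBGD by how much data each uses per gradient step. A minimal sketch on a toy least-squares problem (the dataset, batch size, and names are illustrative, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))        # toy regression dataset: 100 samples, 3 features
true_w = np.array([1.0, -2.0, 0.5])
y = X @ true_w                       # noiseless targets

def gradient(w, Xb, yb):
    """Gradient of the mean squared error over the batch (Xb, yb)."""
    return 2.0 * Xb.T @ (Xb @ w - yb) / len(yb)

def step(w, Xb, yb, lr=0.1):
    """One descent step computed from the batch (Xb, yb) only."""
    return w - lr * gradient(w, Xb, yb)

w0 = np.zeros(3)
w_bgd = step(w0, X, y)               # BGD:  the whole dataset per step
w_sgd = step(w0, X[:1], y[:1])       # SGD:  a single sample per step
w_mbgd = step(w0, X[:16], y[:16])    # MBGD: a small batch (here 16) per step
```

All three variants share the same update rule; only the batch passed to `gradient` changes, trading gradient accuracy against per-step cost.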


Summary

Introduction

Gradient descent (GD) is a technique to minimize an objective function, parameterized by a model's parameters, by updating the parameters in the direction opposite to the gradient of the objective function with respect to the parameters [1]. Zhao et al. [4] used the properties of neural networks to recover the label of a single sample before running the learning-based attack; this technique applies only to single-point gradients, and in the multisample case it behaves the same as [7]. Pan et al. [9] used the ReLU activation function to analyze how sample data leak from the gradients of a multilayer fully connected neural network and showed that multiple samples reveal privacy. SSGD invalidates these attacks on the gradient, including the attack that searches for an optimal training sample [4, 7, 8] by minimizing the distance between the ground-truth gradient and the gradient computed from a variable, and the attack that solves an equation system [9] to obtain the training data. Finally, we conclude the paper and outline future work.
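To make concrete why shared raw gradients leak training data, consider the kind of equation-solving attack attributed to [9] above. A well-known special case: for a fully connected layer with a bias term, the layer's input can be read off in closed form from the shared gradients, since the weight gradient is the outer product of the output error and the input, while the bias gradient is the output error alone. A minimal illustration (toy layer, squared-error loss, all names my own; this is not the paper's construction):

```python
import numpy as np

rng = np.random.default_rng(1)

# A single fully connected layer with bias: out = W @ x + b.
W = rng.normal(size=(4, 3))
b = rng.normal(size=4)
x = np.array([0.3, -1.2, 0.7])       # private training sample

# Forward pass and a toy squared-error loss against a target t.
t = rng.normal(size=4)
out = W @ x + b
delta = 2.0 * (out - t)              # dL/d(out)

# Gradients the victim would share with other nodes.
g_W = np.outer(delta, x)             # dL/dW = delta x^T
g_b = delta                          # dL/db = delta

# Attack: for any row i with g_b[i] != 0, dividing that row of g_W
# by g_b[i] reproduces the private input exactly.
i = int(np.argmax(np.abs(g_b)))
x_recovered = g_W[i] / g_b[i]
```

The recovery here is exact, which is why defenses must alter what is shared rather than rely on gradients being "anonymous"; the paper's claim is that SSGD's concealment of gradient information defeats this family of attacks.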

Preliminaries
Super Stochastic Gradient Descent
Experiments
Conclusions
Findings
Disclosure