Abstract
Machine learning service APIs allow model owners to monetize proprietary models by offering prediction services to third-party users. However, existing literature shows that model parameters are vulnerable to extraction attacks, which accumulate prediction queries and their responses to train a replica model. As countermeasures, researchers have proposed to reduce the richness of the API output, for example by hiding the precise confidence values. Nonetheless, even when the response is only one bit, an adversary can still exploit fine-tuned queries with a differential property to infer the decision boundary of the underlying model. In this article, we propose boundary differential privacy (BDP) against such attacks by obfuscating the prediction responses with noise. BDP guarantees that an adversary cannot learn the decision boundary of any two classes to within a predefined precision, no matter how many queries are issued to the prediction API. We first design a perturbation algorithm called boundary randomized response for a binary model and prove that it satisfies $\epsilon$-BDP, and then generalize this algorithm to a multiclass model. Finally, we generalize the hard boundary to a soft boundary and design an adaptive perturbation algorithm that still works in the latter case. The effectiveness and high utility of our solution are verified by extensive experiments on both linear and non-linear models.
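The abstract describes the binary-model perturbation only at a high level. As a rough illustration, a randomized-response-style perturbation applied to queries near the decision boundary could look like the sketch below; the boundary-zone width `zone` and the flip probability 1/(1 + e^ε) are assumptions of this sketch, not necessarily the exact calibration used by the paper's boundary randomized response algorithm.

```python
import numpy as np

def perturbed_predict(model_confidence, eps, zone=0.1, rng=None):
    """Return a (possibly flipped) binary label for one query.

    model_confidence : P(y = 1 | x) reported by the underlying binary model.
    eps              : privacy budget; larger eps means less noise.
    zone             : half-width of the boundary-sensitive region around 0.5
                       (hypothetical parameter for this sketch).
    """
    rng = rng or np.random.default_rng()
    true_label = int(model_confidence >= 0.5)

    # Queries far from the decision boundary are answered truthfully.
    if abs(model_confidence - 0.5) > zone:
        return true_label

    # Near the boundary, flip the label with probability 1 / (1 + e^eps),
    # so repeated queries cannot pinpoint the boundary exactly.
    flip_prob = 1.0 / (1.0 + np.exp(eps))
    return 1 - true_label if rng.random() < flip_prob else true_label
```

With eps = 1, for instance, a near-boundary query would be answered incorrectly with probability about 0.27 in this sketch, trading some prediction utility for boundary obfuscation.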
Highlights
The pervasive application of artificial intelligence has encouraged the booming business of machine learning services, such as Microsoft Azure Face API, Google Cloud Speech-to-Text, and Amazon Comprehend
We observe that the zone-less boundary differentially private layer (BDPL) performs well with logistic regression models, where we witness an extra drop of 4% on R compared with BDPL
We propose a boundary differentially private layer to defend machine learning models against extraction attacks by obfuscating the query responses
Summary
The pervasive application of artificial intelligence has encouraged the booming business of machine learning services, such as Microsoft Azure Face API, Google Cloud Speech-to-Text, and Amazon Comprehend. To train these high-quality machine learning models, service providers need to spend intense human labor and computational resources to acquire large, well-labeled datasets and tune the training process. However, the parameters of models exposed through prediction APIs are vulnerable to extraction attacks. Once the model is extracted, an adversary can further apply a model inversion attack [2] to learn the proprietary training data, compromising the privacy of data contributors. Another follow-up attack on the extracted model is the evasion attack [3], [4], which avoids a certain prediction result by modifying the query. For example, a hacker may modify the executable binary of a malware or the contents of a phishing email so that it is not detected by an antivirus or spam email filter.