Machine Learning Models that Remember Too Much

Congzheng Song, Vitaly Shmatikov, Thomas Ristenpart

doi:10.1145/3133956.3134077

Abstract

Machine learning (ML) is becoming a commodity. Numerous ML frameworks and services are available to data holders who are not ML experts but want to train predictive models on their data. It is important that ML models trained on sensitive inputs (e.g., personal images or documents) not leak too much information about the training data. We consider a malicious ML provider who supplies model-training code to the data holder, does not observe the training, but then obtains white- or black-box access to the resulting model. In this setting, we design and implement practical algorithms, some of them very similar to standard ML techniques such as regularization and data augmentation, that memorize information about the training dataset in the model, yet the model is as accurate and predictive as a conventionally trained model. We then explain how the adversary can extract memorized information from the model. We evaluate our techniques on standard ML tasks for image classification (CIFAR10), face recognition (LFW and FaceScrub), and text analysis (20 Newsgroups and IMDB). In all cases, we show how our algorithms create models that have high predictive power yet allow accurate extraction of subsets of their training data.
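To make the white-box threat concrete, the following minimal sketch shows one way training code could hide bytes in the low-order mantissa bits of float32 parameters and how someone with access to the parameters could read them back. The abstract does not spell out an encoding, so this is an illustrative assumption rather than the authors' implementation; the function names `embed_bits` and `extract_bits`, the 8-bit budget per parameter, and the toy weight list are all hypothetical.

```python
import struct


def embed_bits(weights, secret, n_bits=8):
    """Overwrite the n_bits lowest bits of each float32 weight with secret bits.

    Hypothetical helper: weights is a list of floats, secret is a bytes object.
    """
    mask = (1 << n_bits) - 1
    bits = "".join(f"{byte:08b}" for byte in secret)
    capacity = len(weights) * n_bits
    if len(bits) > capacity:
        raise ValueError("secret does not fit in the available low-order bits")
    bits = bits.ljust(capacity, "0")  # pad unused capacity with zero bits

    out = []
    for i, w in enumerate(weights):
        # Reinterpret the float32 bit pattern as an unsigned 32-bit integer.
        (raw,) = struct.unpack("<I", struct.pack("<f", w))
        payload = int(bits[i * n_bits:(i + 1) * n_bits], 2)
        raw = (raw & ~mask & 0xFFFFFFFF) | payload
        out.append(struct.unpack("<f", struct.pack("<I", raw))[0])
    return out


def extract_bits(weights, n_bytes, n_bits=8):
    """Read back n_bytes of data from the low-order bits of the weights."""
    mask = (1 << n_bits) - 1
    bits = ""
    for w in weights:
        (raw,) = struct.unpack("<I", struct.pack("<f", w))
        bits += f"{raw & mask:0{n_bits}b}"
    return bytes(int(bits[i:i + 8], 2) for i in range(0, n_bytes * 8, 8))


# Toy "model": 32 parameters, enough room for 32 hidden bytes at 8 bits each.
weights = [0.1 * i - 1.0 for i in range(32)]
secret = b"row 17: alice"
stego = embed_bits(weights, secret)
recovered = extract_bits(stego, len(secret))
max_change = max(abs(a - b) for a, b in zip(stego, weights))
```

Because only the lowest mantissa bits change, each parameter moves by a tiny relative amount, which is the intuition behind a model that stays accurate while carrying extra data. A black-box adversary cannot read parameters directly and would need a different channel.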

