Abstract

With ever-growing data volumes and the need to develop powerful machine learning models, data owners increasingly depend on various untrusted platforms (e.g., public clouds, edges, and machine learning service providers) for scalable processing or collaborative learning. As a result, sensitive data and models are in danger of unauthorized access, misuse, and privacy compromises. A relatively new body of research addresses these concerns by training machine learning models confidentially on protected data. In this survey, we summarize notable studies in this emerging area. Using a unified framework, we highlight the critical challenges and innovations in outsourcing machine learning confidentially. We focus on cryptographic approaches for confidential machine learning (CML), primarily model training, while also covering other directions such as perturbation-based approaches and CML in hardware-assisted computing environments. The discussion takes a holistic view, considering the rich context of related threat models, security assumptions, design principles, and the associated trade-offs among data utility, cost, and confidentiality.

Highlights

  • Data-driven methods, e.g., machine learning and data mining, have become essential tools for numerous research and application domains

  • The cost of external storage and related I/O operations is critical to the cloud-side components, as they are responsible for storing the encrypted data, which is often much larger than the plaintext version and cannot reside in memory

  • When Garbled Circuits (GC) are adopted as a primitive to implement some components, the additional communication cost of the GC protocol is significant, including the cost of transmitting the circuit and obliviously transferring one party's input data to the other party (Liu et al. 2015; Huang et al. 2011)
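The ciphertext-expansion point in the highlights above can be made concrete with a toy additively homomorphic scheme. The sketch below is textbook Paillier encryption, not a construction from the survey itself, and it uses insecurely small parameters purely for illustration: ciphertexts live in Z_{n²}, so each encrypted value occupies roughly twice the bit-length of the modulus, and homomorphic addition lets an untrusted party aggregate values it cannot read.

```python
import math
import secrets

# Toy Paillier cryptosystem -- textbook construction with insecurely
# small primes, purely to illustrate ciphertext expansion and additively
# homomorphic aggregation. Real deployments use a modulus of >= 2048 bits.
p, q = 251, 241
n = p * q                      # public modulus; plaintexts are in Z_n
n2 = n * n                     # ciphertexts live in Z_{n^2}
lam = math.lcm(p - 1, q - 1)   # Carmichael's lambda(n)
g = n + 1                      # standard generator choice
mu = pow(lam, -1, n)           # precomputed decryption constant for g = n + 1

def encrypt(m: int) -> int:
    """Encrypt m < n as c = g^m * r^n mod n^2 for a random r coprime to n."""
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(c: int) -> int:
    """Recover m = L(c^lambda mod n^2) * mu mod n, where L(x) = (x - 1) / n."""
    return ((pow(c, lam, n2) - 1) // n) * mu % n

c1, c2 = encrypt(12), encrypt(30)
# Homomorphic addition: multiplying ciphertexts adds the plaintexts mod n.
c_sum = (c1 * c2) % n2
print(decrypt(c_sum))                      # 42
print(n.bit_length(), c_sum.bit_length())  # ciphertext ~2x the plaintext space
```

The server aggregating `c1` and `c2` never sees 12, 30, or 42, but it must store and transmit operands about twice the size of the plaintext domain, which is one source of the storage and I/O overhead noted above.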


Introduction

Data-driven methods, e.g., machine learning and data mining, have become essential tools for numerous research and application domains. Limited in-house resources, inadequate expertise, or collaborative/distributed processing needs force data owners (e.g., parties that collect and analyze user-generated data) to depend on somewhat untrusted platforms (e.g., cloud/edge service providers) for elastic storage and data processing. For example, Google Cloud Platform allows users to store encrypted data on the cloud while an external key manager operated by a third party (e.g., Fortanix) stores and manages the keys off the cloud. It remains a critical challenge for data owners and cloud providers to protect confidentiality in computing, i.e., to train models on the cloud while protecting the confidentiality of both the training data and the learned models.

