Abstract

As machine learning is applied ever more widely, increasing amounts of sensitive data are used to train models. Unfortunately, machine learning systems have been shown to be vulnerable to a variety of privacy attacks, such as membership inference and model inversion. These threats compromise the confidentiality of the training data; moreover, the leakage of sensitive information discourages data owners from sharing data with machine learning systems, and a resulting shortage of training data in turn hinders the adoption of machine learning. Privacy risk analysis strategies are therefore needed to help data owners pre-assess candidate datasets and implement reasonable privacy controls. However, a systematic privacy risk assessment from the data owner's perspective is still absent. This paper investigates machine learning privacy risks to understand the relationship between training data properties and privacy leakage. Based on this analysis, we introduce a privacy risk assessment scheme built on the clustering distance of training data; the clustering distance reflects the privacy risk level of each individual data record. We then combine existing feature-based privacy analysis with our clustering distance-based method to investigate privacy risks systematically. Our experiments show that clustering distance and other dataset properties are closely related to privacy leakage. Data owners can thus pre-assess the privacy risks of a candidate dataset before uploading or sharing it with a machine learning system and, if needed, choose a different dataset to reduce privacy risks to an acceptable level.
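To make the clustering-distance idea concrete, here is a minimal sketch of how a data owner might score individual records, assuming "clustering distance" means each record's Euclidean distance to its nearest k-means centroid, so that records far from every cluster (outliers) are treated as higher-risk. The function name, the choice of k-means, and the 95th-percentile flagging threshold are illustrative assumptions, not details from the paper.

```python
# Hypothetical sketch: per-record clustering distance as a privacy-risk proxy.
import numpy as np
from sklearn.cluster import KMeans

def clustering_distance_scores(X: np.ndarray, k: int = 10) -> np.ndarray:
    """Per-record score: Euclidean distance to the nearest k-means centroid."""
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    # transform() returns distances to all k centroids, shape (n, k);
    # the minimum over centroids is the record's clustering distance.
    return km.transform(X).min(axis=1)

# Usage: flag the 5% of records farthest from any cluster as high-risk
# before sharing the dataset with a machine learning pipeline.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 16))  # stand-in for a candidate dataset
scores = clustering_distance_scores(X)
high_risk = scores > np.quantile(scores, 0.95)
print(f"{high_risk.sum()} records flagged as potentially high privacy risk")
```

The intuition behind such a score is that isolated records are more easily memorized by a model and hence more exposed to membership inference, so a data owner could remove or perturb flagged records before training.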
