Abstract

Recent efforts have shown that training data is not secured through the generalization and abstraction of algorithms. This vulnerability to the training data has been expressed through membership inference attacks that seek to discover the use of specific records within the training dataset of a model. Additionally, disparate membership inference attacks have been shown to achieve better accuracy compared with their macro attack counterparts. These disparate membership inference attacks use a pragmatic approach to attack individual, more vulnerable sub-sets of the data, such as underrepresented classes. While previous work in this field has explored model vulnerability to these attacks, this effort explores the vulnerability of datasets themselves to disparate membership inference attacks. This is accomplished through the development of a vulnerability-classification model that classifies datasets as vulnerable or secure to these attacks. To develop this model, a vulnerability-classification dataset is developed from over 100 datasets—including frequently cited datasets within the field. These datasets are described using a feature set of over 100 features and assigned labels developed from a combination of various modeling and attack strategies. By averaging the attack accuracy over 13 different modeling and attack strategies, the authors explore the vulnerabilities of the datasets themselves as opposed to a particular modeling or attack effort. The in-class observational distance, width ratio, and the proportion of discrete features are found to dominate the attributes defining dataset vulnerability to disparate membership inference attacks. These features are explored in deeper detail and used to develop exploratory methods for hardening these class-based sub-datasets against attacks showing preliminary mitigation success with combinations of feature reduction and class-balancing strategies.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.