Dataset In Feature Space Research Articles

To improve the reliability of Machine Learning (ML) systems, considerable effort has gone into developing privacy protection techniques. Membership privacy, which concerns whether a specific data point is present in the confidential training set of an ML model, has recently gained increasing attention. However, most current attacks prioritize attack accuracy alone and do not expose the evidence behind the member/non-member classification. This limitation greatly reduces the practicality of Membership Inference Attacks (MIA): real-world data typically includes multiple features, so it is difficult to identify which features are involved in the sensitive training set. This paper therefore targets one of the fundamental challenges in membership inference attacks: measuring the distance between an attack sample and a member sample. Specifically, we propose a novel threat model called Membership Reconstruction Attack (MRA), which aims to reconstruct the exact distribution of the target training set. MRA achieves this by marking each input dimension (e.g., pixels) according to its similarity to the target dataset in feature space. Our attack is effective across various settings, including different major datasets (MNIST, CIFAR-10, CIFAR-100) and different model architectures (AlexNet, ResNet, DenseNet, and generative models). Additionally, we evaluate MRA from the defenders' perspective and test several defense approaches against our attack.

Although Reynolds-Averaged Navier-Stokes (RANS) equations are still the dominant tool for engineering design and analysis applications involving turbulent flows, standard RANS models are known to be unreliable in many flows of engineering relevance, including flows with separation, strong pressure gradients, or mean flow curvature. With increasing amounts of three-dimensional experimental data and high-fidelity simulation data from Large Eddy Simulation (LES) and Direct Numerical Simulation (DNS), data-driven turbulence modeling has become a promising approach to increase the predictive capability of RANS simulations. Recently, a data-driven turbulence modeling approach via machine learning has been proposed to predict the Reynolds stress anisotropy of a given flow based on high-fidelity data from closely related flows. In this work, the closeness of different flows is investigated to assess the prediction confidence a priori. Specifically, the Mahalanobis distance and the kernel density estimation (KDE) technique are used as metrics to quantify the distance between flow data sets in feature space. The flow over periodic hills at Re=10595 is used as the test set, and seven flows with different configurations are each used individually as the training set. The results show that the prediction error of the Reynolds stress anisotropy is positively correlated with both the Mahalanobis distance and the KDE distance, demonstrating that both extrapolation metrics can be used to estimate the prediction confidence a priori. A quantitative comparison using correlation coefficients shows that the Mahalanobis distance is less accurate than the KDE distance in estimating the prediction confidence. The extrapolation metrics introduced in this work and the corresponding analysis provide an approach to aid in the choice of the data source and to assess the prediction confidence for data-driven turbulence modeling.
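The two extrapolation metrics named in this abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the fixed Gaussian bandwidth, and the normalization of the KDE density against a uniform density over the training set's bounding box are all assumptions made here for illustration, and the synthetic data stands in for real flow features.

```python
import numpy as np

def mahalanobis_distance(train, test):
    """Mahalanobis distance of each test row to the training distribution."""
    mu = train.mean(axis=0)
    cov = np.cov(train, rowvar=False)
    cov_inv = np.linalg.pinv(cov)  # pseudo-inverse tolerates a singular covariance
    diff = test - mu
    # d_i = sqrt(diff_i^T * cov_inv * diff_i), evaluated for every test row
    sq = np.einsum("ij,jk,ik->i", diff, cov_inv, diff)
    return np.sqrt(np.maximum(sq, 0.0))

def kde_density(train, test, bandwidth=0.5):
    """Gaussian kernel density estimate of the training set at each test row.

    The fixed isotropic bandwidth is an assumption of this sketch.
    """
    n, d = train.shape
    sq_dist = ((test[:, None, :] - train[None, :, :]) ** 2).sum(axis=2)
    norm = n * (2.0 * np.pi * bandwidth**2) ** (d / 2.0)
    return np.exp(-0.5 * sq_dist / bandwidth**2).sum(axis=1) / norm

def kde_distance(train, test, bandwidth=0.5):
    """Map the KDE density to a distance in [0, 1] (assumed normalization).

    The density is compared with a uniform density over the bounding box
    of the training data; values near 1 indicate extrapolation.
    """
    p = kde_density(train, test, bandwidth)
    volume = np.prod(train.max(axis=0) - train.min(axis=0))
    uniform = 1.0 / volume
    return 1.0 - p / (p + uniform)

# Synthetic stand-ins for flow features: one test set drawn from the
# training distribution, one shifted far away from it.
rng = np.random.default_rng(0)
train = rng.normal(size=(500, 3))
near = rng.normal(size=(100, 3))
far = rng.normal(loc=4.0, size=(100, 3))

for name, test in (("near", near), ("far", far)):
    print(name,
          mahalanobis_distance(train, test).mean(),
          kde_distance(train, test).mean())
```

In this sketch both metrics come out larger for the shifted test set than for the one drawn from the training distribution, which is the behavior an a priori extrapolation indicator needs. The Mahalanobis distance summarizes the training data by its mean and covariance only, whereas the KDE-based distance follows the actual shape of the training distribution.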

Articles published on Dataset In Feature Space

A Data Fusion Framework for Multi-Domain Morality Learning

Membership reconstruction attack in deep neural networks

A Priori Assessment of Prediction Confidence for Data-Driven Turbulence Modeling

On linear separability of data sets in feature space
