Abstract

Data mining is the extraction of previously unknown, valid information from the archived data of organizations. These datasets are mostly high-dimensional, which makes the data mining process difficult. Feature selection is a dimensionality reduction technique used in data mining. Selection stability is the robustness of a feature selection algorithm to small perturbations of the dataset, i.e., its ability to select the same or a similar subset of features in each subsequent iteration. Selection stability depends mostly on the characteristics of the dataset. Privacy-preserving data publishing techniques modify the dataset to protect the privacy of individuals, and this perturbation affects selection stability. The perturbation applied for privacy preservation, feature selection stability, and the accuracy of the data mining results (i.e., data utility) are correlated. Various metrics exist to measure selection stability. This paper analyses privacy-preserving data publishing techniques under these feature selection stability measures with respect to privacy preservation, selection stability, and data utility.
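
As an illustration of what a selection stability measure computes, the sketch below averages the pairwise Jaccard similarity of the feature subsets selected across several runs on perturbed versions of a dataset. The abstract does not name the specific measures analysed, so the choice of the Jaccard index, the function names, and the feature names are assumptions made for illustration only.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity between two feature subsets (1.0 = identical)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # two empty selections are treated as identical
    return len(a & b) / len(a | b)


def selection_stability(subsets):
    """Mean pairwise Jaccard similarity over all selected feature subsets."""
    pairs = list(combinations(subsets, 2))
    if not pairs:
        raise ValueError("need at least two feature subsets")
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Hypothetical feature subsets selected on three perturbed copies of a dataset.
runs = [
    {"age", "income", "zip"},
    {"age", "income", "education"},
    {"age", "income", "zip"},
]
stability = selection_stability(runs)
print(f"selection stability: {stability:.3f}")
```

A value near 1 means the algorithm selects almost the same features despite the perturbation, while a value near 0 means the selections barely overlap.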
