Abstract

Data mining is the mining of formerly not known and valid information from the archived data of organizations. The datasets are mostly high dimensional which will make the data mining process difficult. Feature selection is the dimensionality reduction technique in data mining. Selection stability is the robustness of the feature selection algorithms for small perturbation of the dataset i.e., to select the same or similar subset of features in each subsequent iterations. Selection stability is mostly depending on the characteristics of the dataset. Privacy preserving data publishing techniques modify the dataset for preserving the privacy of the individuals and this perturbation will affect the selection stability. There will be correlation between the perturbations of the dataset for privacy preservation, feature selection stability and accuracy of the data mining results i.e., data utility. There will be various selection stability metrics to measure the selection stability. This paper analyses the privacy preserving data publishing techniques for these various feature selection stability measures on behalf of privacy preservation, selection stability and data utility.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.