Abstract

Privacy is essential when data is shared among knowledge-based applications. Storing sensitive data and moving it to other applications, however, raises serious privacy concerns, so it is vital to build privacy protection into sensitive data before it enters the data mining process. Privacy-preserving protocols allow knowledge to be extracted from the modified data without revealing the original information. In this work, a series of steps (Weight of Evidence, Information Value, Min–Max normalization, and 3D shearing) is applied to perturb the quasi-identifiers in the data. The classification techniques Decision Tree, Random Forest, Extreme Gradient Boosting, and Support Vector Machine are applied to the adult income, bank marketing, and lung cancer datasets to compare performance on the original and perturbed data. Accuracy, variance, and sensitivity-specificity are used as the performance measures of the classifiers. The proposed approach is compared with the 2D rotation and 3D rotation algorithms. The experimental results show that the proposed work preserves data utility while offering higher data transformation capacity and privacy-preserving capacity than the existing geometric transformation techniques.
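
The abstract names the perturbation steps but does not give their formulas, so the following is only a minimal sketch of how such a pipeline might look, not the authors' implementation. It assumes the common textbook definitions of Weight of Evidence and Information Value for a binary target, column-wise Min–Max scaling to [0, 1], and a 3D shear that shifts the first attribute by multiples of the other two. The function names (`woe_iv`, `min_max`, `shear_3d`), the smoothing constant `eps`, the shear factors, and the toy data are all hypothetical.

```python
import numpy as np


def woe_iv(categories, target, eps=0.5):
    """Return ({category: WoE}, IV) for a binary target (1 = event).

    Assumed definitions: WoE = ln(share of events / share of non-events)
    per category, IV = sum((share_events - share_non_events) * WoE).
    """
    categories = np.asarray(categories)
    target = np.asarray(target)
    n_event = target.sum()
    n_non = len(target) - n_event
    woe, iv = {}, 0.0
    for c in np.unique(categories):
        mask = categories == c
        # eps is an ad hoc smoothing term (an assumption) that avoids log(0)
        p_event = (target[mask].sum() + eps) / (n_event + eps)
        p_non = ((1 - target[mask]).sum() + eps) / (n_non + eps)
        w = float(np.log(p_event / p_non))
        woe[c.item()] = w
        iv += (p_event - p_non) * w
    return woe, float(iv)


def min_max(X):
    """Scale each column to [0, 1]; constant columns map to 0."""
    X = np.asarray(X, dtype=float)
    lo, hi = X.min(axis=0), X.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)
    return (X - lo) / span


def shear_3d(X, sh_xy=0.5, sh_xz=0.3):
    """Shear rows (x, y, z): x' = x + sh_xy*y + sh_xz*z, y and z unchanged.

    The shear axis and factors are illustrative assumptions.
    """
    S = np.array([[1.0, sh_xy, sh_xz],
                  [0.0, 1.0, 0.0],
                  [0.0, 0.0, 1.0]])
    return np.asarray(X, dtype=float) @ S.T


if __name__ == "__main__":
    # Made-up toy records: one categorical and two numeric quasi-identifiers.
    occupation = ["clerk", "clerk", "sales", "sales", "tech", "tech"]
    income_high = [0, 0, 0, 1, 1, 1]
    age = [25, 38, 47, 52, 33, 60]
    hours = [40, 35, 45, 50, 38, 60]

    woe, iv = woe_iv(occupation, income_high)
    encoded = [woe[c] for c in occupation]      # categorical -> numeric via WoE
    X = np.column_stack([encoded, age, hours])  # one 3D point per record
    perturbed = shear_3d(min_max(X))            # normalize, then shear
    print(f"IV(occupation) = {iv:.3f}")
    print(perturbed.round(3))
```

In this sketch the shear matrix has a unit diagonal, so it is invertible and changes pairwise distances between records, unlike a rotation; how the paper uses Information Value (for example, to select or order attributes) is not stated in the abstract and is not modeled here.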
