Abstract

This study evaluates how privacy models and ensemble classification algorithms perform when classification is carried out on anonymized data. Data mining is widely used to extract knowledge for a variety of purposes, so privacy must be considered to prevent the results from disclosing the identity of individuals. Data anonymization has emerged as a way to reduce re-identification risk. However, applying it may decrease data utility, so a trade-off between privacy risk and data utility is necessary. Our objectives are to evaluate the effect of anonymization on data classification and to evaluate the performance of various privacy models and ensemble classification algorithms. The metrics used in the experiments are accuracy, re-identification risk, and suppressed records. Our experimental results show no significant difference between the accuracy of classification on the original data and on the anonymized data. In addition, the average accuracy does not differ significantly across the algorithms.
