Abstract

Data privacy problems are studied in the areas of privacy-preserving data mining (PPDM) and statistical disclosure control (SDC), whose shared goal is to avoid disclosing sensitive or proprietary information to third parties. This paper proposes a new synthetic data generation method based on fuzzy techniques and measures its information loss and disclosure risk. Informally, a fuzzy c-regression method is applied to the original data set, and synthetic data are released whose information loss and disclosure risk depend on the parameter c. Like other data protection methods, our synthetic data generation procedure allows third parties to perform some statistical computations with a limited risk of disclosure. The trade-off between data utility and data safety of the proposed method is assessed.
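To make the informal description concrete, the following is a minimal sketch of one way a fuzzy c-regression step could drive synthetic data generation. It is an illustration under stated assumptions, not the method as specified in the paper: it assumes a single predictor and a single response, the standard alternating fuzzy c-regression updates with fuzzifier m, and a release rule in which each synthetic response is the membership-weighted blend of the c fitted lines. The function names `weighted_line_fit`, `fuzzy_c_regression`, and `synthesize`, and all parameter defaults, are hypothetical.

```python
import random


def weighted_line_fit(xs, ys, ws):
    """Weighted least-squares fit of y = a + b*x; returns (a, b)."""
    sw = sum(ws)
    mx = sum(w * x for w, x in zip(ws, xs)) / sw
    my = sum(w * y for w, y in zip(ws, ys)) / sw
    sxx = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
    sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
    b = sxy / sxx if sxx > 0 else 0.0
    return my - b * mx, b


def fuzzy_c_regression(xs, ys, c, m=2.0, iters=50, seed=0):
    """Alternate between fitting c weighted lines and updating memberships.

    Returns (models, u): models is a list of c (intercept, slope) pairs,
    and u[k][i] is the membership of record k in model i.
    """
    rng = random.Random(seed)
    n = len(xs)
    # Random initial memberships; each record's memberships sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])
    models = []
    for _ in range(iters):
        # Fit each model with weights u_ik ** m.
        models = [
            weighted_line_fit(xs, ys, [u[k][i] ** m for k in range(n)])
            for i in range(c)
        ]
        # Update memberships from squared residuals (floored to avoid 0 ** -p).
        for k in range(n):
            errs = [max((ys[k] - (a + b * xs[k])) ** 2, 1e-12) for a, b in models]
            inv = [e ** (-1.0 / (m - 1.0)) for e in errs]
            total = sum(inv)
            u[k] = [v / total for v in inv]
    return models, u


def synthesize(xs, models, u):
    """Assumed release rule: membership-weighted blend of model predictions."""
    return [
        sum(u[k][i] * (a + b * xs[k]) for i, (a, b) in enumerate(models))
        for k in range(len(xs))
    ]


if __name__ == "__main__":
    xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
    ys = [0.1, 1.2, 1.9, 13.0, 14.1, 14.8]
    for c in (1, 2):
        models, u = fuzzy_c_regression(xs, ys, c)
        print(c, [round(v, 2) for v in synthesize(xs, models, u)])
```

Under this assumed release rule, c = 1 collapses every record onto a single regression line, while larger c lets the synthetic values track the original data more closely, which is one way to read the dependence of information loss and disclosure risk on c.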
