Abstract

Data privacy in e-health concerns the protection of sensitive medical information that is collected, stored, and analyzed in electronic health systems. Many organizations publish sensitive person-specific data for research purposes, and e-health data and related domains are a focus of that research. Publishing such datasets raises several issues. First, the privacy of users' sensitive information must be ensured. Second, privacy preservation and data utility are conflicting goals that are difficult to achieve simultaneously. In addition, assuming the same prior belief for all transactions may result in erroneous modeling and privacy breaches. To limit what an adversary can infer and to address these issues, an organization must ensure a semantic privacy guarantee before publishing data. For the first issue, this paper proposes a framework for privacy preservation of structured datasets that ensures an adversary has low confidence in extrapolation. The framework also addresses the second issue by combining stratified sampling with generalization to achieve representative semantic privacy preservation with high data utility. Moreover, this study presents a mathematical proof that the proposed framework achieves differential privacy. Experimental results show that the algorithm provides better data utility and privacy simultaneously. Compared with state-of-the-art privacy-preservation approaches, the proposed framework achieves 3% higher classification accuracy and 0.04% lower relative error.
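As a rough illustration of how stratified sampling and generalization can be combined, the following minimal Python sketch samples each stratum of a toy dataset at the same rate and then coarsens a quasi-identifier. It is not the paper's algorithm: the record fields (`age`, `diagnosis`), the function names, the sampling rate, and the 10-year age bands are all assumptions made for illustration, and this sketch on its own does not provide a differential privacy guarantee.

```python
import random
from collections import defaultdict


def generalize_age(age, width=10):
    """Replace an exact age with a coarse band such as '30-39' (hypothetical scheme)."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def stratified_sample(records, strata_key, rate, seed=0):
    """Sample every stratum independently at the same rate.

    Keeping at least one record per stratum is a simplification for this
    toy example, not a privacy-motivated choice.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for rec in records:
        strata[rec[strata_key]].append(rec)
    sample = []
    for group in strata.values():
        k = max(1, round(len(group) * rate))
        sample.extend(rng.sample(group, k))
    return sample


def anonymize(records, strata_key="diagnosis", rate=0.5, seed=0):
    """Sample first, then generalize the quasi-identifier of each kept record."""
    sampled = stratified_sample(records, strata_key, rate, seed)
    return [{**rec, "age": generalize_age(rec["age"])} for rec in sampled]


# Toy, made-up records; not data from the paper.
records = [
    {"age": 34, "diagnosis": "flu"},
    {"age": 37, "diagnosis": "flu"},
    {"age": 52, "diagnosis": "flu"},
    {"age": 58, "diagnosis": "flu"},
    {"age": 41, "diagnosis": "diabetes"},
    {"age": 66, "diagnosis": "diabetes"},
]

published = anonymize(records, rate=0.5, seed=0)
```

Sampling per stratum keeps the proportions of the sensitive attribute close to those of the original data, which is one way a published sample can stay representative, while generalization reduces how precisely any kept record can be linked to an individual.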
