Abstract

Data mining techniques are applied to identify hidden patterns in large amounts of patient data. These patterns can assist physicians in making more accurate diagnoses. Because patients differ in physical condition, the same physiological index corresponds to a different symptom association probability for each patient, so the data are uncertain. Data mining technologies based on certain data therefore cannot be directly applied to these patients' data. Patient data are also sensitive: an adversary with sufficient background information can use the patterns mined from uncertain medical data to obtain sensitive information about patients. In this paper, a new algorithm is presented to determine the top K most frequent itemsets from uncertain medical data while protecting data privacy. Building on traditional algorithms for mining frequent itemsets from uncertain data, our algorithm applies the sparse vector algorithm and the Laplace mechanism to ensure differential privacy for the top K most frequent itemsets in uncertain medical data and for the expected supports of these itemsets. We prove theoretically that our algorithm guarantees differential privacy. Moreover, we carry out experiments with four real-world scenario datasets and two synthetic datasets. The experimental results demonstrate the performance of our algorithm.
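
To make the two quantities named in the abstract concrete, the following is a minimal sketch, not the authors' algorithm. It computes the expected support of an itemset over uncertain transactions, assuming item probabilities within a transaction are independent, and then perturbs it with Laplace noise. The data set `patients`, the function names, and the sensitivity value of 1 (one patient record changes an expected support by at most 1) are illustrative assumptions; the sparse vector step used to select the top K itemsets is not shown.

```python
import random

# Hypothetical uncertain data: each transaction maps a symptom to the
# probability that the symptom is present for that patient.
patients = [
    {"fever": 0.9, "cough": 0.5},
    {"fever": 0.4},
    {"fever": 0.5, "cough": 0.8},
]

def expected_support(itemset, transactions):
    """Sum, over all transactions, of the probability that every item in
    `itemset` is present (assumes items are independent within a transaction)."""
    total = 0.0
    for transaction in transactions:
        prob = 1.0
        for item in itemset:
            prob *= transaction.get(item, 0.0)  # absent item has probability 0
        total += prob
    return total

def laplace_noise(scale, rng):
    """Draw one Laplace(0, scale) sample as the difference of two
    independent exponential samples with mean `scale`."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def noisy_expected_support(itemset, transactions, epsilon, rng):
    """Laplace mechanism on the expected support.
    Assumption: sensitivity is 1, so the noise scale is 1 / epsilon."""
    true_value = expected_support(itemset, transactions)
    return true_value + laplace_noise(1.0 / epsilon, rng)

if __name__ == "__main__":
    rng = random.Random(0)
    print(expected_support({"fever", "cough"}, patients))
    print(noisy_expected_support({"fever", "cough"}, patients, epsilon=1.0, rng=rng))
```

A smaller `epsilon` gives a larger noise scale and therefore stronger privacy at the cost of less accurate released supports.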

Highlights

  • The Internet of Things (IoT) involves a lot of different base technologies, such as wireless sensors, data management, and cloud computing [1]

  • Through privacy analysis, we prove that U-PrivMining guarantees differential privacy in theory

  • In order to prove that U-PrivMining guarantees differential privacy, we introduce the notions of count query set and threshold query set


Summary

Introduction

The Internet of Things (IoT) draws on many underlying technologies, such as wireless sensors, data management, and cloud computing [1]. Medical personnel can use IoT technology to collect large amounts of patient data that help them provide better medical services to patients [5, 6]. Traditional algorithms for mining frequent itemsets from medical data are based on certain data [7] and can be applied to discover hidden symptom patterns in large volumes of patient symptom data. Health managers can use these patterns to provide better healthcare for users [8]. However, traditional algorithms for mining frequent itemsets from certain data cannot be directly applied to patient data.

