Abstract

Most typical data mining techniques are developed for training on batch data, which makes mining data streams a significant challenge. At the same time, mechanisms that allow data mining operations to be performed without revealing patients’ identities are increasingly important in the data mining field. In this work, a classification model with differential privacy is proposed for mining medical data streams using Adaptive Random Forest (ARF). The experimental results of applying the proposed model to four medical datasets show that ARF mostly has more stable performance than the other six techniques.

Highlights

  • A series of research efforts and projects in medical science and information technology (IT) is building a relationship between the healthcare industry and the IT industry that is rapidly leading to better and more interactive relations among patients, their doctors, and health institutions

  • Zhang et al. [22] used two noise mechanisms, Laplace and exponential, to provide privacy. They used lower noise sensitivity to avoid a large impact on the choice of split points. They applied the proposed model to only one dataset, and the results showed more stable classification accuracy compared with three other algorithms

  • Stream data faces several constraints: (1) the infinite arrival of data samples makes storing them all impossible, (2) the fast arrival of data samples requires handling each sample in real time, (3) the distribution of items may change over time, so that old data becomes useless for the current state
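
As a rough illustration of two ideas from the highlights above (Laplace noise for privacy, and one-pass processing of a stream that cannot be stored), the following minimal sketch releases differentially private class counts from a stream. It is not the model proposed in the paper or the method of Zhang et al. [22]. The function names `laplace_noise` and `noisy_class_counts`, the fixed label set, and the assumption that each sample contributes to exactly one count (sensitivity 1) are all hypothetical choices made for this example.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from Laplace(0, scale).

    The difference of two independent exponential variables with
    mean `scale` follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_class_counts(stream, labels, epsilon, rng):
    """One pass over `stream`, then release Laplace-noised class counts.

    Assumptions of this sketch (not taken from the paper):
    - `labels` is the full set of class labels, known in advance;
    - each sample adds 1 to exactly one count, so the sensitivity is 1
      and the noise scale is 1 / epsilon.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    counts = {label: 0 for label in labels}
    for label in stream:          # each sample is seen once and not stored
        counts[label] += 1
    scale = 1.0 / epsilon
    return {label: c + laplace_noise(scale, rng) for label, c in counts.items()}


if __name__ == "__main__":
    rng = random.Random(42)
    stream = iter(["sick", "healthy", "healthy", "sick", "healthy"])
    print(noisy_class_counts(stream, ["sick", "healthy"], epsilon=0.5, rng=rng))
```

A smaller `epsilon` gives stronger privacy but noisier counts. Handling a distribution that changes over time, constraint (3), is not covered by this sketch.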


Summary

Introduction

Computing Classification System 1998: H.2.8, I.2.1
Mathematics Subject Classification 2010: 68P25, 97R40
Key words and phrases: ensemble methods, bagging, privacy-preserving protocol.

A series of research efforts and projects in medical science and information technology (IT) is building a relationship between the healthcare industry and the IT industry that is rapidly leading to better and more interactive relations among patients, their doctors, and health institutions. One of the most remarkable challenges facing data mining is privacy preservation. Privacy is an important component of medical data processing, as many health institutions refrain from releasing this data to the public for fear of compromising patient privacy. Providing a mechanism to carry out data mining operations without revealing the patient’s identity has recently attracted the interest of researchers.

