Abstract
Most data mining techniques are designed to train on batch data, which makes mining data streams a significant challenge. At the same time, performing data mining operations without revealing patients' identities is increasingly important in the field. In this work, a classification model with differential privacy is proposed for mining medical data streams using Adaptive Random Forest (ARF). Experimental results on four medical datasets show that ARF mostly performs more stably than the other six techniques.
Highlights
A series of research projects in medical science and information technology (IT) is building a relationship between the healthcare industry and the IT industry that is rapidly leading to better, more interactive relations among patients, their doctors, and health institutions.
Zhang et al. [22] used two noise mechanisms, Laplace and exponential, to provide privacy. They used lower noise sensitivity to avoid a large impact on the choice of split points. They applied the proposed model to only one dataset, and the results showed more stable classification accuracy than three other algorithms.
Stream data faces several constraints: (1) the unbounded arrival of data samples makes storing them all impossible, (2) the fast arrival of data samples requires handling each sample in real time, and (3) the distribution of items may change over time, making old data useless for the current state.
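The three stream constraints above can be illustrated with a minimal sketch of single-pass, test-then-train evaluation. This is an assumption made for illustration, not the evaluation protocol of the paper: the `MajorityClass` learner is a hypothetical placeholder (not ARF), and `prequential_accuracy` and its `window` parameter are names invented here. Each sample is seen once and discarded, and accuracy is tracked only over a bounded window, so older results drop out as the distribution changes.

```python
from collections import Counter, deque


class MajorityClass:
    """Hypothetical placeholder learner (not ARF): predicts the most frequent label so far."""

    def __init__(self):
        self.counts = Counter()

    def predict(self, x):
        # No prediction is possible before the first label has been seen.
        return self.counts.most_common(1)[0][0] if self.counts else None

    def learn(self, x, y):
        self.counts[y] += 1


def prequential_accuracy(stream, model, window=1000):
    """Test-then-train over a stream, keeping only the last `window` outcomes."""
    recent = deque(maxlen=window)  # bounded memory: older outcomes fall out
    for x, y in stream:
        recent.append(model.predict(x) == y)  # test on the sample first
        model.learn(x, y)                     # then train; the sample is not stored
    return sum(recent) / len(recent) if recent else 0.0
```

Because `stream` can be any iterator, the loop never needs the full dataset in memory; a smaller `window` reacts faster to distribution change at the cost of a noisier estimate.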
Summary
A series of research projects in medical science and information technology (IT) is building a relationship between the healthcare industry and the IT industry that is rapidly leading to better, more interactive relations among patients, their doctors, and health institutions.
Computing Classification System 1998: H.2.8, I.2.1
Mathematics Subject Classification 2010: 68P25, 97R40
Key words and phrases: ensemble methods, bagging, privacy-preserving protocol.
One of the most remarkable challenges facing data mining is privacy preservation. Privacy is an important component of medical data processing, as many health institutions refrain from making this data public for fear of compromising patient privacy. Providing a mechanism to carry out data mining operations without revealing patients' identities has recently attracted the interest of researchers.
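As a rough illustration of how such a mechanism can hide an individual's contribution, the sketch below shows the standard Laplace mechanism, one of the two noise mechanisms named in the highlights. It is not the model proposed in the paper nor the method of Zhang et al.; the function names `laplace_noise` and `private_count` and the default sensitivity of 1 are assumptions made for this example. Noise with scale sensitivity/epsilon is added to a count before it is released, so a smaller epsilon gives stronger privacy and a noisier answer.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace distribution with location 0 and that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(true_count, epsilon, sensitivity=1.0, rng=None):
    """Release a count with Laplace noise of scale sensitivity / epsilon."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    if rng is None:
        rng = random.Random()
    return true_count + laplace_noise(sensitivity / epsilon, rng)
```

For example, `private_count(120, epsilon=0.5)` would return 120 plus noise of scale 2; the released value differs on every call unless a seeded `rng` is passed.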