Privacy Preserving Data Mining Framework for Negative Association Rules: An Application to Healthcare Informatics

Saad M Darwish,Reham M Essa,Mohamed A Osman,Ahmed A Ismail

doi:10.1109/access.2022.3192447

Open Access

https://doi.org/10.1109/access.2022.3192447

Journal: IEEE Access
Publication Date: Jan 1, 2022
Citations: 13
License type: CC BY 4.0

Affiliation: Information Technology Institute

Abstract

Protecting the privacy of healthcare information encourages data custodians to share accurate records, so that mining can proceed with confidence. Association rule mining has been widely applied to healthcare data. Most applications focus on positive association rules, ignoring the negative consequences of particular diagnostic techniques. When relating divergent diseases and drugs, negative association rules may be more informative than positive ones, particularly for physicians and social organizations (e.g., a certain symptom will not arise when certain other symptoms are present). Healthcare data mining must protect the identity of patients, especially when sensitive information is involved. Releasing such information, however, exposes it to attack. Recent approaches to healthcare data privacy perturb the data (data sanitization) and reconstruct aggregate distributions so that data mining research can still be carried out. This study investigates metaheuristic-based data sanitization for healthcare data mining as a way to protect patient privacy. Using a Tabu-genetic algorithm as the optimization tool, the proposed technique selects the item sets to be sanitized (modified) from transactions that satisfy sensitive negative criteria, with the goal of minimizing changes to the original database. Experiments with benchmark healthcare datasets show that the proposed privacy-preserving data mining (PPDM) method outperforms existing algorithms in terms of Hiding Failure (HF), Artificial Rule Generation (AR), and Lost Rules (LR).
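
To make the terms in the abstract concrete, the following minimal Python sketch shows how a negative association rule of the form A => NOT B can be scored, and how HF, LR, and AR might be computed from rule sets mined before and after sanitization. The abstract does not give formulas, so the sketch assumes the definitions commonly used in the PPDM literature; the paper's exact definitions may differ. The function names, the toy symptom items, and the rule labels are hypothetical and serve illustration only.

```python
from typing import FrozenSet, Hashable, List, Set, Tuple


def negative_rule_stats(
    transactions: List[Set[str]],
    antecedent: FrozenSet[str],
    consequent: FrozenSet[str],
) -> Tuple[float, float]:
    """Support and confidence of the negative rule: antecedent => NOT consequent.

    Assumption: "NOT consequent" holds for a transaction when the consequent
    itemset is not fully contained in it, so that
    supp(A => NOT B) = supp(A) - supp(A union B).
    """
    n = len(transactions)
    with_a = [t for t in transactions if antecedent <= t]
    a_without_b = [t for t in with_a if not consequent <= t]
    support = len(a_without_b) / n if n else 0.0
    confidence = len(a_without_b) / len(with_a) if with_a else 0.0
    return support, confidence


def sanitization_metrics(
    rules_before: Set[Hashable],
    rules_after: Set[Hashable],
    sensitive: Set[Hashable],
) -> Tuple[float, float, float]:
    """Return (HF, LR, AR) under commonly used PPDM definitions (assumed here).

    HF: fraction of sensitive rules still discoverable after sanitization.
    LR: fraction of non-sensitive rules that disappeared.
    AR: fraction of rules in the sanitized data that did not exist before.
    """
    non_sensitive = rules_before - sensitive
    hf = len(sensitive & rules_after) / len(sensitive) if sensitive else 0.0
    lr = (
        len(non_sensitive - rules_after) / len(non_sensitive)
        if non_sensitive
        else 0.0
    )
    ar = len(rules_after - rules_before) / len(rules_after) if rules_after else 0.0
    return hf, lr, ar


# Toy data: each transaction is the set of items recorded for one patient.
toy_transactions = [
    {"fever", "cough"},
    {"fever"},
    {"fever", "rash"},
    {"cough"},
]
support, confidence = negative_rule_stats(
    toy_transactions, frozenset({"fever"}), frozenset({"rash"})
)

# Toy rule sets, identified by hypothetical labels.
hf, lr, ar = sanitization_metrics(
    rules_before={"r1", "r2", "r3", "r4"},
    rules_after={"r2", "r3", "r5"},
    sensitive={"r1", "r2"},
)
```

Under these assumed definitions, lower values are better for all three measures: an ideal sanitization hides every sensitive rule (HF = 0) while neither losing non-sensitive rules (LR = 0) nor introducing spurious ones (AR = 0).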

Full Text