Abstract

The CODATA Data Science Journal is a peer-reviewed, open access, electronic journal, publishing papers on the management, dissemination, use and reuse of research data and databases across all research domains, including science, technology, the humanities and the arts. The scope of the journal includes descriptions of data systems, their implementations and their publication, applications, infrastructures, software, legal, reproducibility and transparency issues, and the availability and usability of complex datasets, with a particular focus on the principles, policies and practices for open data. All data is in scope, whether born digital or converted from other sources.

Highlights

  • Data mining, or knowledge discovery in databases (KDD), which focuses on the extraction of useful knowledge from large amounts of data, has steadily attracted researchers and practitioners from various fields

  • Privacy preservation is an important issue in medical data mining

  • This paper investigates data separation techniques in medical data classification


Summary

INTRODUCTION

Data mining, or knowledge discovery in databases (KDD), which focuses on the extraction of useful knowledge from large amounts of data, has steadily attracted researchers and practitioners from various fields. Privacy issues have been raised since as early as 1989, when the first KDD workshop was held in Detroit, Michigan. This is an especially important issue in medical data mining. The objective of this paper is to apply data separation-based techniques to preserve privacy in the classification of medical data. In the vertical partition approach, each site uses a portion of the attributes to compute its results, and the distributed results are assembled at a central trusted party using a majority-vote ensemble method. Each site computes its own data, and a central trusted party is responsible for integrating these results. We implement these two approaches using two medical datasets from the UCI Machine Learning Repository: the Wisconsin prognostic breast cancer dataset and the heart-disease dataset. The section explains why and how we use vertical and horizontal separation techniques to protect the privacy of medical data.
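
The vertical partition approach described above can be illustrated with a minimal sketch. It is not the paper's implementation. The toy records, the assignment of attributes to three sites, and the nearest-centroid classifier used at each site are all assumptions made for illustration, and the sketch also assumes that class labels are available to every site. Only the overall pattern comes from the text: each site works on its own portion of the attributes, and a central trusted party combines the per-site results by majority vote.

```python
from collections import Counter
from statistics import mean

# Toy records with four numeric attributes (hypothetical values,
# not taken from the UCI datasets). Labels are 0 or 1.
ROWS = [
    [1.0, 2.0, 0.5, 10.0],
    [1.2, 1.8, 0.4, 11.0],
    [0.9, 2.1, 0.6, 9.0],
    [5.0, 6.0, 3.0, 30.0],
    [5.2, 5.8, 3.1, 29.0],
    [4.9, 6.1, 2.9, 31.0],
]
LABELS = [0, 0, 0, 1, 1, 1]

# Hypothetical vertical partition: each site sees only these attribute indices.
SITE_COLUMNS = {"site_a": [0, 1], "site_b": [2], "site_c": [3]}


def vertical_slice(row, columns):
    """Keep only the attributes that one site is allowed to see."""
    return [row[c] for c in columns]


def train_centroids(rows, labels):
    """Fit a nearest-centroid model: one mean vector per class label."""
    by_label = {}
    for row, label in zip(rows, labels):
        by_label.setdefault(label, []).append(row)
    return {
        label: [mean(column) for column in zip(*members)]
        for label, members in by_label.items()
    }


def predict(centroids, row):
    """Return the label whose centroid is nearest in squared Euclidean distance."""
    def distance(label):
        return sum((a - b) ** 2 for a, b in zip(row, centroids[label]))
    return min(centroids, key=distance)


def majority_vote(votes):
    """Central trusted party: return the most common vote."""
    return Counter(votes).most_common(1)[0][0]


# Each site trains a local model on its own attribute subset only.
SITE_MODELS = {
    site: train_centroids([vertical_slice(r, cols) for r in ROWS], LABELS)
    for site, cols in SITE_COLUMNS.items()
}


def classify(row):
    """Collect one vote per site (each sees only its slice), then combine them."""
    votes = [
        predict(SITE_MODELS[site], vertical_slice(row, cols))
        for site, cols in SITE_COLUMNS.items()
    ]
    return majority_vote(votes)


if __name__ == "__main__":
    print(classify([1.1, 2.0, 0.5, 10.5]))
    print(classify([1.0, 2.0, 0.5, 25.0]))
```

In the second call, the hypothetical `site_c` votes for class 1 because its single attribute is closer to that class, but it is outvoted by the other two sites, so the combined result is class 0. An odd number of sites is used here so that a two-class vote cannot tie.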

PRIVACY-PRESERVING MEDICAL DATA MINING
Horizontal Data Separation Experiment
Findings
CONCLUSION
