Abstract
The CODATA Data Science Journal is a peer-reviewed, open access, electronic journal, publishing papers on the management, dissemination, use and reuse of research data and databases across all research domains, including science, technology, the humanities and the arts. The scope of the journal includes descriptions of data systems, their implementations and their publication, applications, infrastructures, software, legal, reproducibility and transparency issues, the availability and usability of complex datasets, and with a particular focus on the principles, policies and practices for open data.All data is in scope, whether born digital or converted from other sources.
Highlights
Data mining or knowledge discovery in databases (KDD), which focuses on the extraction of useful knowledge from large amount of data, has steadily attracted researchers and practitioners from various fields
Privacy-preservation is an important issue in medical data mining
This paper investigates data separation techniques in medical data classification
Summary
Data mining or knowledge discovery in databases (KDD), which focuses on the extraction of useful knowledge from large amount of data, has steadily attracted researchers and practitioners from various fields. As early as 1989, when the first KDD workshop was held in Detroit, Michigan, privacy issues have been brought up This is an especially important issue in medical data mining. The objective of this paper is to apply data separation-based techniques to preserve privacy in classification of medical data. In the vertical partition approach, each site uses a portion of the attributes to compute its results, and the distributed results are assembled at a central trusted party using majority-vote ensemble method. Each site computes its own data, and a central trusted party is responsible to integrate these results We implement these two approaches using two medical datasets from UCI Machine Learning repository: Wisconsin prognostic breast cancer dataset and heart-disease dataset. The section explains why and how we use vertical and horizontal separation techniques to protect privacy of medical data.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.