Best Feature Selection for Horizontally Distributed Private Biomedical Data Based on Genetic Algorithms

Boudheb Tarik, Elberrichi Zakaria

doi:10.4018/ijdst.2019070103

Abstract

Driven by the growing success of machine learning in healthcare, medical institutions want to share their patients' data in order to build more accurate models that support better decisions. However, because this data is private, they are reluctant to do so. Building the best models requires selecting the best features from horizontally distributed private biomedical data, where each institution holds the same features for a different set of patients. Previously proposed solutions rely on data perturbation techniques, which degrade performance. In this article, the researchers propose an original solution that does not perturb the data, so that data utility, and therefore performance, is preserved. The proposed solution uses a genetic algorithm, a distributed Naïve Bayes classifier, and a trusted third party. The results obtained by the proposed approach surpass those reported by other researchers for the same problem.
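The abstract does not spell out the protocol, so the following is only a minimal sketch of one way the three components could fit together, not the authors' implementation. It assumes categorical features, that each site sends only aggregate counts to the trusted third party, and that the third party holds a validation set for scoring. All function names (`local_counts`, `aggregate`, `predict`, `fitness`) and the `n_values` parameter are hypothetical. The `fitness` function is what a genetic algorithm would call to score one candidate feature mask.

```python
from collections import Counter
import math

def local_counts(rows, labels, mask):
    """Run at one site: count class labels and (feature, value, class)
    triples for the features selected by mask. Only counts leave the site."""
    class_counts = Counter(labels)
    feat_counts = Counter()
    for row, y in zip(rows, labels):
        for j, keep in enumerate(mask):
            if keep:
                feat_counts[(j, row[j], y)] += 1
    return class_counts, feat_counts

def aggregate(site_stats):
    """Run at the trusted third party (assumed role): sum all site counts."""
    class_counts, feat_counts = Counter(), Counter()
    for cc, fc in site_stats:
        class_counts.update(cc)   # Counter.update adds counts
        feat_counts.update(fc)
    return class_counts, feat_counts

def predict(row, mask, class_counts, feat_counts, n_values=2):
    """Naive Bayes prediction from aggregated counts, with Laplace smoothing.
    n_values is the assumed number of distinct values per feature."""
    total = sum(class_counts.values())
    best, best_score = None, -math.inf
    for y, cy in class_counts.items():
        score = math.log(cy / total)
        for j, keep in enumerate(mask):
            if keep:
                score += math.log((feat_counts[(j, row[j], y)] + 1) / (cy + n_values))
        if score > best_score:
            best, best_score = y, score
    return best

def fitness(mask, sites, val_rows, val_labels):
    """Accuracy of the aggregated model restricted to the masked features."""
    stats = [local_counts(rows, labels, mask) for rows, labels in sites]
    cc, fc = aggregate(stats)
    hits = sum(predict(r, mask, cc, fc) == y for r, y in zip(val_rows, val_labels))
    return hits / len(val_labels)

# Toy data: two sites, two binary features; feature 0 tracks the label.
site_a = ([[0, 0], [1, 1], [0, 1]], [0, 1, 0])
site_b = ([[1, 0], [0, 0], [1, 1]], [1, 0, 1])
print(fitness((1, 0), [site_a, site_b], [[0, 1], [1, 0]], [0, 1]))
```

Because Naïve Bayes depends on the data only through these counts, summing them across sites gives the same model as training on the pooled records. This is why no perturbation is needed in a sketch of this kind. A real deployment would also have to consider what the counts themselves reveal.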
