Online medical prediagnosis systems have already proven effective at guiding patients to healthcare services at lower time and cost. Building a high-precision primary diagnosis system nevertheless faces serious challenges: the privacy of individual health information, the distributed storage of medical data, and the diversity of diseases. In this paper, we propose an efficient, privacy-preserving framework for obtaining a pre-clinical guide model, which allows an authorized data analysis center to train a disease classifier on the combined medical data of different entities. The proposed scheme is based on a soft-margin support vector machine (SVM) that takes a Taylor polynomial of the exponential loss as its penalty. It offers two advantages: the trained model tolerates some abnormal samples and therefore generalizes better, and the training process limits the inefficient operations performed in the encrypted domain, which makes it possible to use a partially homomorphic encryption system. Finally, we prove that the proposed scheme achieves the goal of constructing a medical prediagnosis system without leaking data to the data analysis center or exposing model parameters to the data providers, and we demonstrate its utility and efficiency on real-world medical datasets.
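
To illustrate why a polynomial penalty suits partially homomorphic encryption, the following is a minimal sketch and not the paper's exact formulation. The abstract does not state the order of the Taylor polynomial; second order is assumed here, and the symbols $w$, $b$, $C$, $x_i$, $y_i \in \{-1, +1\}$ and $z_i$ are introduced only for illustration.

```latex
% Assumption: second-order Taylor expansion of the exponential loss around z = 0.
e^{-z} \approx 1 - z + \tfrac{1}{2} z^{2},
\qquad z_i = y_i \,(w^{\top} x_i + b)

% Soft-margin objective with the polynomial penalty (illustrative form).
\min_{w,\,b} \; \tfrac{1}{2} \lVert w \rVert^{2}
  + C \sum_{i=1}^{n} \left( 1 - z_i + \tfrac{1}{2} z_i^{2} \right)

% Gradient of the penalty with respect to w, using y_i^{2} = 1.
\nabla_{w} \sum_{i=1}^{n} \left( 1 - z_i + \tfrac{1}{2} z_i^{2} \right)
  = \Big( \sum_{i=1}^{n} x_i x_i^{\top} \Big) w
  + b \sum_{i=1}^{n} x_i
  - \sum_{i=1}^{n} y_i x_i
```

Under these assumptions the gradient depends on the data only through sums over samples, so the encrypted-domain work in such a sketch could reduce to ciphertext additions and multiplications by plaintext values, which an additively homomorphic scheme supports.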