Abstract

Data privacy is a major concern when accessing and processing sensitive medical data. A promising approach among privacy-preserving techniques is homomorphic encryption (HE), which allows computations to be performed directly on encrypted data. Currently, HE still faces practical limitations related to high computational complexity, noise accumulation, and applicability restricted to the level of bits or small integer values. We propose herein an encoding method that enables typical HE schemes to operate on real-valued numbers of arbitrary precision and size. The approach is evaluated in two real-world scenarios relying on EEG signals: seizure detection and prediction of predisposition to alcoholism. A supervised machine learning-based approach is formulated, and training is performed using a direct (non-iterative) fitting method that requires a fixed and deterministic number of steps. Experiments on synthetic data of varying size and complexity are performed to determine the impact on runtime and error accumulation. The computational time for training the models increases but remains manageable, while the inference time remains on the order of milliseconds. The prediction performance of the models operating on encoded and encrypted data is comparable to that of standard models operating on plaintext data.
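As a concrete illustration of the encoding problem the abstract describes, the sketch below shows one common way to map real values onto the integers that typical HE schemes operate on: fixed-point scaling with explicit scale bookkeeping. This is a minimal sketch under assumed parameters, not the paper's actual encoding method; `SCALE`, `encode`, and `decode` are hypothetical names, and the homomorphic operations are simulated on plaintext integers purely to show how the scale propagates.

```python
# Minimal fixed-point encoding sketch for integer-based HE schemes.
# Hypothetical parameter: SCALE fixes the precision; the paper's actual
# encoding method (arbitrary precision and size) is not reproduced here.

SCALE = 10**6  # assumed fixed-point scaling factor

def encode(x: float, scale: int = SCALE) -> int:
    """Map a real value to a scaled integer that an HE scheme can encrypt."""
    return round(x * scale)

def decode(v: int, scale: int = SCALE) -> float:
    """Map a scaled integer back to a real value."""
    return v / scale

# Homomorphic addition preserves the scale; multiplication squares it,
# so a product must be decoded with scale**2 (or rescaled).
a, b = encode(3.14159), encode(-0.5)
assert abs(decode(a + b) - (3.14159 - 0.5)) < 1e-5
assert abs(decode(a * b, SCALE**2) - (3.14159 * -0.5)) < 1e-5
```

Similarly, the abstract's direct (non-iterative) fitting method is not spelled out in this excerpt; a closed-form least-squares (ridge) solve is one plausible example of that class, since it needs only a fixed, deterministic number of matrix products and a single linear solve, which suits HE's limited multiplicative depth. The sketch below assumes numpy and the hypothetical function name `fit_direct`.

```python
import numpy as np

def fit_direct(X: np.ndarray, y: np.ndarray, lam: float = 1e-3) -> np.ndarray:
    """Closed-form ridge regression: solve (X^T X + lam*I) w = X^T y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

# Toy usage: recover known weights from slightly noisy observations.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
w_true = np.arange(1.0, 6.0)
y = X @ w_true + 0.01 * rng.normal(size=100)
assert np.allclose(fit_direct(X, y), w_true, atol=0.1)
```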

Highlights

  • In recent years, artificial intelligence (AI) algorithms have shown great potential in several fields, including healthcare

  • All experiments were performed in parallel on plaintext data and on its encrypted equivalent to verify that identical results were obtained

  • We proposed a method for performing privacy-preserving machine learning-based predictions on homomorphically encrypted medical data

Introduction

Artificial intelligence (AI) algorithms have shown great potential in several fields, including healthcare. By allowing for customized diagnosis, treatment planning, and disease prevention, AI has demonstrated its ability to deliver personalized medicine [1]. To obtain satisfactory performance, a significant amount of data needs to be collected, stored, and processed. While AI achieves promising results in patient-specific medical applications, accessing sensitive data for training AI models requires proper anonymization [2]. Even at the inference phase, when an already trained AI model is employed, privacy must not be compromised. Regulations regarding personal data confidentiality (e.g., GDPR in the EU, HIPAA in the U.S.A.) emphasize the need for more effective privacy-preserving techniques.
