HE-Friendly Algorithm for Privacy-Preserving SVM Training

Saerom Park, Joohee Lee, Jung Hee Cheon, Jaewook Lee, Junyoung Byun

doi:10.1109/access.2020.2981818

Abstract

Support vector machine (SVM) is one of the most popular machine learning algorithms. It predicts a pre-defined output variable in real-world applications. Machine learning on encrypted data is becoming increasingly important for protecting both model information and data against various adversaries. While several studies have addressed the inference (prediction) phase, few have addressed the training phase. The homomorphic encryption (HE) scheme for arithmetic of approximate numbers enables efficient arithmetic on encrypted real-valued data, which motivates the development of privacy-preserving training algorithms for machine learning. In this study, we propose an HE-friendly algorithm for the SVM training phase that avoids inefficient operations and numerical instability in the encrypted domain. The inference phase is also implemented in the encrypted domain with fully homomorphic encryption, which enables real-time prediction. Our experiments showed that our HE-friendly algorithm outperformed the state-of-the-art logistic regression classifier with fully homomorphic encryption on toy and real-world datasets. To the best of our knowledge, this study presents the first practical algorithm for training an SVM model with fully homomorphic encryption. Our result therefore supports the development of practical applications of the privacy-preserving SVM model.

Highlights

Machine learning has recently gained considerable attention because of its usefulness in many big data analytics tasks involving artificial intelligence, in areas such as marketing, healthcare, and financial services.
We propose, for the first time, a privacy-preserving training algorithm for the Support Vector Machine (SVM) model with Fully Homomorphic Encryption (FHE), based on gradient descent for a least squares problem.
We aim to develop a scalable, secure SVM training algorithm based on the HEAAN scheme.
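
As a rough illustration of what gradient descent on a least squares problem can look like, the following plaintext Python sketch trains a linear classifier on a regularized least-squares objective. It is not the paper's algorithm and involves no encryption: the form of the objective, the function names `train_ls_classifier` and `decision_value`, the hyperparameters, and the toy data are all assumptions made for illustration. What it shows is that every update uses only additions and multiplications, with division by the sample count replaced by multiplication with a precomputed public constant.

```python
# Plaintext sketch only: gradient descent on a regularized least-squares
# classification objective. This is an assumed, simplified objective and
# NOT the paper's exact algorithm; no encryption is performed here.
# Each update uses only additions and multiplications, the operations
# that approximate-arithmetic HE schemes support directly.

def train_ls_classifier(xs, ys, lr=0.1, reg=0.01, steps=200):
    """Minimize (1/2n) * sum((w.x_i + b - y_i)^2) + (reg/2) * ||w||^2."""
    n = len(xs)
    d = len(xs[0])
    w = [0.0] * d
    b = 0.0
    inv_n = 1.0 / n  # public constant, so no division on data is needed
    for _ in range(steps):
        grad_w = [0.0] * d
        grad_b = 0.0
        for x, y in zip(xs, ys):
            # Residual of the linear model on one sample.
            residual = sum(wj * xj for wj, xj in zip(w, x)) + b - y
            for j in range(d):
                grad_w[j] += residual * x[j]
            grad_b += residual
        # Gradient step with L2 regularization on w (the bias is unregularized).
        w = [wj - lr * (inv_n * gj + reg * wj) for wj, gj in zip(w, grad_w)]
        b = b - lr * inv_n * grad_b
    return w, b


def decision_value(w, b, x):
    """Return the raw score w.x + b; its sign is the predicted class."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


# Hypothetical toy data: two linearly separable clusters labeled +1 / -1.
xs = [[1.0, 1.0], [1.5, 0.5], [0.8, 1.2],
      [-1.0, -1.0], [-1.2, -0.7], [-0.6, -1.4]]
ys = [1.0, 1.0, 1.0, -1.0, -1.0, -1.0]
w, b = train_ls_classifier(xs, ys)
```

The sign of `decision_value(w, b, x)` gives the predicted class for a new point `x`. Taking that sign is a comparison rather than an addition or multiplication, so in this sketch it is assumed to happen on the plaintext score.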

Summary

Introduction

Machine learning has recently gained considerable attention because of its usefulness in many big data analytics tasks involving artificial intelligence, in areas such as marketing, healthcare, and financial services. Demand for services that use machine learning algorithms is increasing, which creates a clear need for new and more effective privacy-preserving technologies. Support Vector Machine (SVM) is one of the most popular methods for classifying data. Despite the explosive popularity of deep learning, the SVM model remains important: deep learning algorithms require large amounts of data, whereas kernel methods work well on medium-sized data. Training an SVM model requires solving a convex optimization problem, whereas training a deep learning model requires solving a non-convex one. To apply SVM models in real-world scenarios, the model parameters and training data must be protected to preserve secrecy.

Objectives

Methods

Discussion

Conclusion