Privacy-Preserving Non-Negative Matrix Factorization with Outliers

Hafiz Imtiaz,Swapnil Saha

doi:10.1145/3632961

Abstract

Non-negative matrix factorization is a popular unsupervised machine learning algorithm for extracting meaningful features from inherently non-negative data. Such data often contain privacy-sensitive user information. Additionally, the dataset can contain outliers, which may lead to extracting sub-optimal features from the data. It is, therefore, necessary to address these two issues while analyzing privacy-sensitive data that may contain outliers. In this work, we develop a non-negative matrix factorization algorithm in the privacy-preserving framework that (i) considers the presence of outliers in the data, and (ii) can achieve results comparable to those of the non-private algorithm. We design our method in such a way that one has the control to select the degree of privacy grantee based on the required utility gap. We show the effectiveness of our proposed algorithm’s performance on six real and diverse datasets. The experimental results show that our proposed method can achieve a performance that closely approximates the performance of the non-private algorithm under some parameter choices, while ensuring strict privacy guarantees.

Full Text