Abstract

In applications such as trajectory tracking in mobile social networks and online recommendation systems, the massive raw data are often incomplete for unpredictable or unavoidable reasons. Matrix completion algorithms are effective for reconstructing two-dimensional data, but sending raw data that contain personal, sensitive information to cloud computing nodes for matrix completion may expose private information. Homomorphic matrix completion is a promising approach that performs matrix completion while preserving privacy. However, CPU-based homomorphic matrix completion is slow, making it impractical for processing multiple or large-scale data completion tasks in real time. In this paper, we propose a high-performance homomorphic matrix completion scheme that exploits commodity GPUs (Graphics Processing Units), which are widely available in HPC servers and cloud computing nodes. First, we design and implement a baseline GPU-based homomorphic matrix completion, and propose techniques to optimize memory accesses, GPU utilization, and communications. Second, we propose a shard mode for large-scale matrices that exceed GPU memory capacity. Third, we propose a multi-GPU mode to fully utilize multiple GPUs in computing nodes. Experimental results show that the proposed scheme is both fast and accurate. On matrices of varying sizes, the proposed scheme running on a single Tesla V100 GPU achieves a speedup of up to 116.23× over the CPU MATLAB implementation running on dual Xeon CPUs. The multi-GPU mode achieves a speedup of up to 1.84× on two GPUs over a single GPU. For large-scale matrices, the shard mode on a single GPU achieves a speedup of up to 174.92× over the CPU MATLAB implementation on two CPUs, and a further speedup of up to 1.35× when running on two GPUs using the multi-GPU mode.

Highlights

  • In many applications such as video and image processing [2], outdoor and indoor localization [3], [4], and recommendation systems [5], the massive data are often incomplete for various reasons arising in data acquisition and transmission [6]

  • For large matrices, the proposed scheme on a Tesla V100 GPU achieves a speedup of up to 174.92× over the CPU MATLAB implementation [19] running on dual Xeon CPUs, and a further speedup of up to 1.35× when running on two GPUs using the multi-GPU mode

  • Overview of the homomorphic matrix completion algorithm: the algorithm [19] consists of three steps, as shown in Fig. 2. Step 1, encryption of raw data on user devices (Fig. 2(a)): each user encrypts her data vector on her own device, such as a mobile phone, using the public matrix P ∈ R^{I×K} provided by the cloud computing nodes
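
The encryption step in the last highlight can be illustrated with a small sketch. The text here does not give the encryption formula used in [19], so the sketch assumes one plausible form, additive masking: the user adds a private random combination of the columns of the public matrix P to her data vector. The helper names (`mat_vec`, `encrypt`, `decrypt`), the Gaussian key, and the toy sizes are all hypothetical and chosen only for illustration.

```python
import random

def mat_vec(P, k):
    """Multiply an I x K matrix (given as a list of rows) by a length-K vector."""
    return [sum(p_ij * k_j for p_ij, k_j in zip(row, k)) for row in P]

def encrypt(x, P, key):
    """ASSUMED scheme (not taken from [19]): mask x with the private combination P @ key."""
    mask = mat_vec(P, key)
    return [x_i + m_i for x_i, m_i in zip(x, mask)]

def decrypt(x_enc, P, key):
    """Remove the same mask; only the holder of key can do this."""
    mask = mat_vec(P, key)
    return [c_i - m_i for c_i, m_i in zip(x_enc, mask)]

rng = random.Random(0)
I, K = 6, 2                                                    # toy sizes (assumption)
P = [[rng.gauss(0, 1) for _ in range(K)] for _ in range(I)]    # public matrix, I x K
x = [float(i) for i in range(I)]                               # user's raw data vector
key = [rng.gauss(0, 1) for _ in range(K)]                      # private coefficients, kept on the device

x_enc = encrypt(x, P, key)      # what would be sent to the cloud
x_rec = decrypt(x_enc, P, key)  # recovered locally by the user
```

Under this assumed masking, stacking the encrypted vectors of all users gives the raw matrix plus a term of rank at most K, so the encrypted matrix has rank at most K higher than the raw one and remains suitable for low-rank completion.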


Summary

INTRODUCTION

In many applications such as video and image processing [2], outdoor and indoor localization [3], [4], and recommendation systems [5], the massive data are often incomplete for various reasons arising in data acquisition and transmission [6]. Matrix completion on plain (i.e., unencrypted) data on cloud computing nodes may expose private information. We design, implement, and optimize a GPU-based homomorphic matrix completion scheme to achieve high performance and accuracy. We propose a multi-GPU mode to fully utilize multiple GPUs in cloud computing nodes. This mode achieves a speedup of up to 1.84× on two GPUs over a single GPU.

NOTATIONS
PARALLEL ACCELERATION ANALYSIS
OPTIMIZATIONS
MULTI-GPU HOMOMORPHIC MATRIX COMPLETION
LARGE-SCALE HOMOMORPHIC MATRIX COMPLETION ON MULTIPLE GPUs
RELATED WORKS
Findings
CONCLUSION
