Abstract

Big graphs, such as user interactions in social networks and customer rating matrices in collaborative filters, possess great values for both businesses and research. They are not only big but often keep evolving, which requires a large amount of computing resources to maintain. With the wide deployment of public cloud resources, owners of big graphs may want to use cloud resources to obtain storage and computation scalability. However, privacy and ownership of the graphs in the cloud has become a major concern. In this paper, we study privacy-preserving algorithms for one of the important graph analysis techniques—graph spectral analysis for outsourced graph in the cloud. The core operation: eigendecomposition of large matrix is also important to many data mining algorithms. We consider a cloud-centric framework with three collaborative parties: data contributors, data owner, and cloud provider. Graphs are represented as matrices such as adjacency matrix and Laplacian matrix, the elements of which are encrypted and submitted by distributed contributors. The data owner then interacts with the cloud-side programs to conduct spectral analysis, while protecting data privacy from the honest-but-curious cloud provider. For a $N\times N$N×N graph matrix, we aim to design algorithms with the cloud handling expensive storage and computation in $O(N^2)$O(N2) complexity, while data owner and data contributors’ algorithms take only $O(N)$O(N). To achieve this goal, we develop the privacy-preserving versions of the two approximate eigendecomposition algorithms: the Lanczos algorithm and the Nystrom algorithm, considering two encryption methods: additive homomorphic encryption (AHE) methods and somewhat homomorphic encryption (SHE) methods. Both dense and sparse matrices are studied, while sparse matrices also involve a differentially private data submission protocol to allow the trade-off between data sparsity and privacy. Experimental results show that the Nystro algorithm with sparse encoding can dramatically reduce data owners’ costs. SHE-based methods have lower computational time while AHE-based methods have lower communication/storage costs.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.