Abstract

Machine learning usually requires a large amount of training data to build useful models. We exploit the mathematical structure of linear regression to develop a secure and privacy-preserving method that allows multiple parties to collaboratively compute optimal model parameters without sharing their raw data. The approach also supports efficient unlearning: the data of users who leave the group and wish to have their data deleted can be removed from the model. Since the data remain confidential during both the learning and unlearning processes, data owners are more inclined to share the datasets they collect to improve the models, ultimately benefiting all participants. The proposed collaborative computation of linear regression models does not require a trusted third party, thereby avoiding the difficulty of building a robust trust system in the current Internet environment. The scheme requires neither encryption to keep the data secret nor transformations to hide the real data; instead, each party sends only aggregated data, which makes the scheme computationally efficient. By contrast, almost all homomorphic encryption schemes that support both addition and multiplication demand significant computational resources and offer only computational security. We prove that a malicious party lacks sufficient information to deduce the precise values of another party's original data, thereby preserving the privacy and security of the data exchanges. We also show that the new linear regression learning scheme can be updated incrementally: new datasets can be easily incorporated, and specific data can be removed to refine the model, without recomputing from scratch.
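
The abstract does not spell out which aggregates are exchanged, but a standard way to realize this kind of scheme is for each party to share only the sufficient statistics XᵀX and Xᵀy of its local data, from which the least-squares solution can be pooled, updated, and unlearned by simple addition and subtraction. The following Python sketch, using hypothetical synthetic data, illustrates that assumption; it is not the paper's exact protocol.

```python
import numpy as np

def local_statistics(X, y):
    """Each party computes only these aggregates; the raw X and y never leave the party."""
    return X.T @ X, X.T @ y

def solve(A, b):
    """Fit theta from the pooled aggregates via the normal equations (ordinary least squares)."""
    return np.linalg.solve(A, b)

# --- Hypothetical data held by three parties (for illustration only) ---
rng = np.random.default_rng(0)
true_theta = np.array([2.0, -1.0, 0.5])
parties = []
for _ in range(3):
    X = rng.normal(size=(100, 3))
    y = X @ true_theta + 0.01 * rng.normal(size=100)
    parties.append((X, y))

# Collaborative learning: sum the aggregated statistics from all parties.
stats = [local_statistics(X, y) for X, y in parties]
A = sum(s[0] for s in stats)
b = sum(s[1] for s in stats)
theta = solve(A, b)

# Incremental update: a new party's aggregates are simply added to A and b.
# Unlearning: remove party 2's contribution by subtracting its aggregates,
# then re-solve -- no retraining from scratch and no access to its raw data.
A_del = A - stats[2][0]
b_del = b - stats[2][1]
theta_after_removal = solve(A_del, b_del)

print(theta, theta_after_removal)
```

Because only the d×d matrix XᵀX and the d-dimensional vector Xᵀy are exchanged, the communication and computation costs depend on the feature dimension rather than on the number of local records, which is consistent with the efficiency claim above.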
