Privacy protection of medical data in social network

Jie Su, Yi Cao, Yahui Liu, Yuehui Chen, Jinming Song

doi:10.1186/s12911-021-01645-0

Abstract

Background: Protecting private data published in the health care field is an important research area. The Health Insurance Portability and Accountability Act (HIPAA) in the USA is the current legislation for privacy protection. However, the Institute of Medicine Committee on Health Research and the Privacy of Health Information recently concluded that HIPAA cannot adequately safeguard privacy, while at the same time it prevents researchers from using medical data for effective research. Therefore, more effective privacy protection methods are urgently needed to ensure the security of released medical data.

Methods: Clustering-based privacy protection methods aim to keep published data both useful and protected. In this paper, we first analyzed the importance of the key attributes of medical data in the social network. According to the function of each attribute and the main objective of privacy protection, the attribute information was divided into three categories. We then proposed an algorithm based on greedy clustering that groups the data points according to the attributes and the connectivity information of the nodes in the published social network. Finally, we analyzed the information lost during clustering, and evaluated the proposed approach with respect to classification accuracy and information loss rates on a medical dataset.

Results: The social network associated with a medical dataset was analyzed for privacy preservation. We evaluated generalization loss and structure loss for different values of k and a, i.e. k ∈ {3, 6, 9, 12, 15, 18, 21, 24, 27, 30} and a ∈ {0, 0.2, 0.4, 0.6, 0.8, 1}. The experimental results showed that generalization loss approached its optimum when a = 1 and k = 21, and structure loss approached its optimum when a = 0.4 and k = 3.

Conclusion: We showed the importance of the attributes and the structure of released health data in privacy preservation. Our method achieved better privacy preservation in social networks by optimizing generalization loss and structure loss. The proposed loss evaluation method struck a balance between data availability and the risk of privacy leakage.
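
As a rough illustration of the greedy clustering idea described in the abstract, the following Python sketch groups nodes into clusters of at least k members using a distance that mixes attribute similarity with link similarity. It is not the authors' algorithm: the function names, the two distance definitions, the seed selection rule, and the reading of a as the weight on attribute distance (with 1 - a on structural distance) are all assumptions made for illustration.

```python
def attribute_distance(u, v, records, ranges):
    """Mean range-normalized absolute difference over numeric attributes (assumed metric)."""
    n_attrs = len(ranges)
    return sum(abs(records[u][i] - records[v][i]) / ranges[i] for i in range(n_attrs)) / n_attrs


def structure_distance(u, v, adjacency):
    """Fraction of other nodes linked to exactly one of u and v (assumed metric)."""
    others = set(adjacency) - {u, v}
    if not others:
        return 0.0
    differing = sum((w in adjacency[u]) != (w in adjacency[v]) for w in others)
    return differing / len(others)


def greedy_k_clusters(records, adjacency, k, a):
    """Greedily build clusters of at least k nodes.

    records:   {node: tuple of numeric quasi-identifier values}
    adjacency: {node: set of neighbouring nodes}
    a:         hypothetical weight in [0, 1] on attribute distance;
               (1 - a) is applied to structural distance.
    """
    if len(records) < k:
        raise ValueError("need at least k records to form one cluster")

    n_attrs = len(next(iter(records.values())))
    # Attribute ranges for normalization; fall back to 1.0 for constant attributes.
    ranges = []
    for i in range(n_attrs):
        values = [r[i] for r in records.values()]
        ranges.append((max(values) - min(values)) or 1.0)

    def dist(u, v):
        return (a * attribute_distance(u, v, records, ranges)
                + (1 - a) * structure_distance(u, v, adjacency))

    unassigned = set(records)
    clusters = []
    while len(unassigned) >= k:
        seed = min(unassigned)  # deterministic seed choice, purely for the sketch
        unassigned.remove(seed)
        cluster = [seed]
        while len(cluster) < k:
            # Add the node with the smallest total distance to the current members.
            best = min(unassigned, key=lambda n: (sum(dist(n, m) for m in cluster), n))
            unassigned.remove(best)
            cluster.append(best)
        clusters.append(cluster)

    # Fewer than k nodes remain: attach each to its nearest existing cluster.
    for node in sorted(unassigned):
        target = min(clusters, key=lambda c: sum(dist(node, m) for m in c) / len(c))
        target.append(node)
    return clusters


# Toy data: one numeric attribute per node (e.g. age) and a few links.
toy_records = {0: (20.0,), 1: (21.0,), 2: (22.0,), 3: (60.0,), 4: (61.0,), 5: (62.0,)}
toy_links = {0: {1}, 1: {0, 2}, 2: {1}, 3: {4}, 4: {3, 5}, 5: {4}}
print(greedy_k_clusters(toy_records, toy_links, k=3, a=0.5))
```

Under this assumed weighting, a = 1 clusters purely on attributes and a = 0 purely on links, so sweeping a and k as in the reported experiments would trade one kind of loss against the other.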

Highlights

Protecting private data published in the health care field is an important research area
To prevent attacks on network structure, we provided a k-anonymous greedy clustering algorithm based on the entity attributes of the released social network
Key attributes of medical data in the social network: when medical data is released, each dataset contains multiple tuples, and each tuple corresponds to a specific individual member of society

Summary

Introduction

Protecting private data published in the health care field is an important research area. The Institute of Medicine Committee on Health Research and the Privacy of Health Information recently concluded that HIPAA cannot adequately safeguard privacy, while at the same time it prevents researchers from using medical data for effective research. More effective privacy protection methods are urgently needed to ensure the security of released medical data. With the rapid increase in data volume and the development of cloud storage platforms, the security of medical data faces growing challenges. This is due to the rise of the mobile medical industry and the need to share information among commercial health insurance information systems, basic medical insurance information systems, and medical institution information systems. Privacy protection is therefore a very important consideration in medical data sharing and distribution.

Methods

Results

Discussion

Conclusion