Conditional Masking to Numerical Data

Debolina Ghatak, Bimal K. Roy

doi:10.1007/s42519-019-0042-y

Abstract

Protecting the privacy of datasets has become increasingly important. Many real-life datasets, such as income data and medical data, need to be secured before they are made public. However, security comes at the cost of losing some useful statistical information about the dataset. Data obfuscation addresses this problem: masking a dataset so that the utility of the data is maximized while the risk of disclosing sensitive information is minimized. Two popular approaches to data obfuscation for numerical data are (i) data swapping and (ii) adding noise to the data. The former masks well but sacrifices all correlation information; the latter gives estimates for most of the popular statistics, such as the mean, variance, quantiles and correlation, but fails to give an unbiased estimate of the distribution curve of the original data. In this paper, we propose a mixed method of obfuscation that combines these two approaches, and we discuss how the proposed method succeeds in giving an unbiased estimate of the distribution curve while still giving reliable estimates of other well-known statistics such as moments and correlation.
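The trade-off between the two baseline approaches can be illustrated with a minimal sketch. It is not the mixed method proposed in the paper, which the abstract does not specify. It only shows, on synthetic data, simple versions of data swapping (a random permutation of one column) and additive Gaussian noise. The variable names, the exponential "income" column, the correlated "covariate" column and the noise level `sigma` are all assumptions made for illustration.

```python
import random
import statistics


def pearson(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def swap_mask(values, rng):
    """Data swapping (simplest form): release a random permutation of the column."""
    masked = list(values)
    rng.shuffle(masked)
    return masked


def noise_mask(values, sigma, rng):
    """Noise addition: release x + e, with e drawn from N(0, sigma^2)."""
    return [x + rng.gauss(0.0, sigma) for x in values]


rng = random.Random(0)
n = 5000
sigma = 1.0  # assumed noise level, known to the data analyst

# Synthetic, illustrative data (an assumption, not from the paper):
# a skewed "income" column and a covariate correlated with it.
income = [rng.expovariate(1.0) for _ in range(n)]
covariate = [x + rng.gauss(0.0, 0.5) for x in income]

swapped = swap_mask(income, rng)
noisy = noise_mask(income, sigma, rng)

# Swapping keeps the marginal distribution exactly, but the
# correlation with the other column is lost.
print("same multiset after swapping:", sorted(swapped) == sorted(income))
print("corr original:", round(pearson(income, covariate), 3))
print("corr swapped: ", round(pearson(swapped, covariate), 3))

# Noise keeps the mean, and the variance can be corrected when sigma is known,
# but the released values follow a wider distribution than the original.
var_noisy = statistics.variance(noisy)
print("mean original / noisy:", round(statistics.fmean(income), 3),
      round(statistics.fmean(noisy), 3))
print("var original / noisy / corrected:", round(statistics.variance(income), 3),
      round(var_noisy, 3), round(var_noisy - sigma ** 2, 3))
```

In this sketch the swapped column has the same empirical distribution as the original but is nearly uncorrelated with the covariate, whereas the noisy column preserves the mean and allows a variance correction while its empirical distribution is visibly wider than the original.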

Journal: Journal of Statistical Theory and Practice
Publication Date: May 16, 2019
Citations: 1