Abstract
In this paper, we consider a Controlled Tabular Adjustment (CTA) model for statistical disclosure limitation of tabular data. The goal of the CTA model is to find the closest safe (masked) table to the original table that contains sensitive information. The measure of closeness is usually measured using ℓ 1 or ℓ 2 norm. However, in the norm-based CTA model, there is no control of how well the statistical properties of the data in the original table are preserved in the masked table. Hence, we propose a different criterion of "closeness" between the masked and original table which attempts to minimally change certain statistics used in the analysis of the table. The Chi-square statistic is among the most utilized measures for the analysis of data in two-dimensional tables. Hence, we propose a Chi-square CTA model which minimizes the objective function that depends on the difference of the Chi-square statistics of the original and masked table. The model is non-linear and non-convex and therefore harder to solve which prompted us to also consider a modification of this model which can be transformed into a linear programming model that can be solved more efficiently. We present numerical results for the two-dimensional table illustrating our novel approach and providing a comparison with norm-based CTA models.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
More From: Privacy in statistical databases. PSD (Conference : 2004- )
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.