Abstract

This research investigates the micro-aggregation problem in secure statistical databases by integrating the divide-and-conquer concept with a genetic algorithm. A micro-data set is recursively divided into two subsets based on distance proximity. On each subset, the genetic operation “crossover” is performed until the convergence condition is satisfied. The recursion terminates once the size of the generated subset satisfies the group size constraint. Finally, the genetic operation “mutation” is performed over all generated subsets that satisfy the variable group size constraint in order to maximize the objective function. The proposed micro-aggregation technique was applied experimentally to recommended real-life data sets. The results show a substantial reduction in computational time, in some cases exceeding 70% compared to the state of the art. Furthermore, a good equilibrium value of the Scoring Index (SI), computed as a linear combination of the General Information Loss (GIL) and the General Disclosure Risk (GDR), was achieved.

Highlights

  • A large number of users, clients, and customers access data and information, which raises concerns regarding the confidentiality of data [1,2]

  • The strength of the newly developed Recursive Genetic Micro-Aggregation Technique (RGMAT) is most evident when it is used with a variable group size constraint

  • The RGMAT recursively divides the whole data set into two groups (chromosomes) based on the distance proximity between the individual micro-records (genes)
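
The recursive division mentioned in the highlights can be sketched as follows. This is a minimal illustration, not the authors' RGMAT: it omits the genetic crossover and mutation steps entirely, and the splitting rule (two seed records chosen by distance, with the cut clamped so both halves keep at least k records) is an assumption made here for the example. The function names `centroid` and `recursive_split` are hypothetical.

```python
import math


def centroid(records):
    """Component-wise mean of a list of equal-length numeric tuples."""
    n = len(records)
    return tuple(sum(col) / n for col in zip(*records))


def recursive_split(records, k):
    """Recursively bisect records until every group has k <= size <= 2k - 1.

    Assumed splitting rule (not taken from the paper): pick the record
    farthest from the centroid as seed A, the record farthest from A as
    seed B, and order all records by how much closer they are to A than to B.
    """
    n = len(records)
    if k < 1 or n < k:
        raise ValueError("need k >= 1 and at least k records")
    if n <= 2 * k - 1:
        # Group size already satisfies the variable-size constraint.
        return [list(records)]

    c = centroid(records)
    seed_a = max(records, key=lambda r: math.dist(r, c))
    seed_b = max(records, key=lambda r: math.dist(r, seed_a))

    def bias(r):
        # Negative when r is closer to seed A than to seed B.
        return math.dist(r, seed_a) - math.dist(r, seed_b)

    ordered = sorted(records, key=bias)
    closer_to_a = sum(1 for r in ordered if bias(r) < 0)

    # Clamp the cut so that both halves contain at least k records.
    cut = min(max(closer_to_a, k), n - k)
    return recursive_split(ordered[:cut], k) + recursive_split(ordered[cut:], k)
```

Clamping the cut point is a design choice of this sketch: it guarantees that the recursion never produces a group smaller than k, at the cost of occasionally assigning a record to the farther seed.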


Summary

Introduction

A large number of users, clients, and customers access data and information, which raises concerns regarding the confidentiality of data [1,2]. Access to statistical summaries is obligatory in several public and private entities [3,4], which threatens data security and privacy. Several statistical agencies worldwide aim to provide useful statistical summaries without breaking the confidentiality requirements. The confidentiality and utility of the data are assessed using various methods and strategies [4]. “Micro-Aggregation” is a perturbative method that partitions the micro-data file into groups of either a fixed size k or a variable size satisfying k ≤ size ≤ 2k − 1, where k is a predefined threshold set by the data protector [4]. If the group size satisfies this constraint, the Micro-Aggregation Technique (MAT) discloses the mean values of the group as a replacement for the original micro-records.
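
The replacement step described above can be illustrated with a short sketch. It assumes the groups have already been formed and that each record is a tuple of numeric attributes; the function name `micro_aggregate` is hypothetical and is not taken from the paper.

```python
def micro_aggregate(groups, k):
    """Replace every record by the mean of its group.

    Each group must satisfy the variable-size constraint k <= size <= 2k - 1;
    a group that violates it is rejected instead of being released.
    """
    released = []
    for group in groups:
        size = len(group)
        if not (k <= size <= 2 * k - 1):
            raise ValueError(f"group of size {size} violates k <= size <= 2k - 1")
        # Component-wise mean over the numeric attributes of the group.
        mean = tuple(sum(col) / size for col in zip(*group))
        # Every original micro-record is disclosed as the group mean.
        released.extend([mean] * size)
    return released


groups = [
    [(1.0, 2.0), (3.0, 4.0)],
    [(10.0, 10.0), (20.0, 30.0), (30.0, 20.0)],
]
released = micro_aggregate(groups, k=2)
```

With k = 2, the two records of the first group are both released as (2.0, 3.0) and the three records of the second group as (20.0, 20.0), so no individual record is distinguishable from the others in its group.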
