Abstract

This study proposes an optimal approach to anonymization for big data. MapReduce is a big data processing framework used across distributed applications. Before the data are passed to the MapReduce framework, they are distributed and clustered using a hybrid clustering algorithm, which combines the k-means clustering algorithm with the MFCM clustering algorithm to group similar records. The clustered data are then fed into the MapReduce framework. To guarantee privacy, an optimal k-anonymization method is proposed. k-anonymization can employ two techniques, generalisation and randomization; which one is applied depends on the type of the quasi-identifier attribute. Our method replaces the standard k-anonymization process, in which k is fixed, with an optimization algorithm that dynamically determines the optimal k value. The Modified Grey Wolf Optimization (MGWO) algorithm is used for this optimization. Memory usage, execution time, accuracy, and error value are used to assess the performance of the proposed method. The experiments show that the proposed method outperforms the existing method, taking the least time while ensuring the greatest level of security. Compared with the existing technique, the proposed method also achieves higher accuracy. The solution is implemented in Java with Hadoop MapReduce, and it is tested and deployed in the cloud on Google Cloud Platform.
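
As a rough illustration of the k-anonymity idea summarised above, the following minimal Java sketch generalises two quasi-identifiers and checks whether every resulting group contains at least k records. It is not the authors' implementation: the class name `KAnonymitySketch`, the helper methods, the sample ages and ZIP codes, and the fixed generalisation levels are all hypothetical. The value of k is passed in by hand here, whereas the study selects it with MGWO, and the clustering, randomization, and MapReduce stages are omitted.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: generalisation-based k-anonymity check on two quasi-identifiers.
final class KAnonymitySketch {

    // Generalise a numeric quasi-identifier (assumed: age) into a range of the given width.
    static String generaliseAge(int age, int width) {
        int low = (age / width) * width;
        return low + "-" + (low + width - 1);
    }

    // Generalise a categorical quasi-identifier (assumed: ZIP code) by masking trailing characters.
    static String generaliseZip(String zip, int keep) {
        StringBuilder sb = new StringBuilder(zip.substring(0, Math.min(keep, zip.length())));
        while (sb.length() < zip.length()) {
            sb.append('*');
        }
        return sb.toString();
    }

    // A table is k-anonymous if every combination of generalised
    // quasi-identifier values occurs in at least k records.
    static boolean isKAnonymous(List<String> generalisedKeys, int k) {
        Map<String, Integer> counts = new HashMap<>();
        for (String key : generalisedKeys) {
            counts.merge(key, 1, Integer::sum);
        }
        for (int count : counts.values()) {
            if (count < k) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Made-up sample records, for illustration only.
        int[] ages = {23, 27, 25, 41, 45, 48};
        String[] zips = {"47677", "47602", "47678", "47905", "47909", "47906"};

        List<String> keys = new ArrayList<>();
        for (int i = 0; i < ages.length; i++) {
            keys.add(generaliseAge(ages[i], 10) + "|" + generaliseZip(zips[i], 3));
        }

        System.out.println(keys);
        System.out.println("3-anonymous: " + isKAnonymous(keys, 3));
        System.out.println("4-anonymous: " + isKAnonymous(keys, 4));
    }
}
```

In this sketch a larger k demands larger groups and therefore coarser generalisation, which is the trade-off between privacy and accuracy that an optimizer over k would need to balance.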
