Abstract
Many organisations still consider preserving privacy for big data as a major challenge. Parallel computation can be used to optimise big data analysis. This paper gives a proposal for parallelising k-anonymisation algorithms through comparative study and survey. The k-anonymisation algorithms considered are MinGen, DataFly, Incognito and Mondrian. The result shows the parallel versions of the algorithms perform better than sequential counterparts, as data size increases. For small size dataset in sequential mode MinGen is 71.83% faster than parallel version. However, in sequential mode DataFly and in parallel mode incognito performed well. For large size dataset in parallel mode Incognito is 101.186% faster than sequential. However, in sequential mode MinGen and DataFly performed well. In parallel mode Incognito, DataFly and MinGen performed well. The paper acts as a single point of reference for choosing big data mining k-anonymisation algorithms. This paper gives direction of applying HPC concepts such as parallelisation for privacy preserving algorithms.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.