Abstract

The sharing of sensitive personal data has become a core element of biomedical research. To protect privacy, a broad spectrum of techniques must be implemented, including data anonymization. In this article, we present ARX, an anonymization tool for structured data which supports a broad spectrum of methods for statistical disclosure control by providing (1) models for analyzing re-identification risks, (2) risk-based anonymization, (3) syntactic privacy criteria, such as k-anonymity, l-diversity, t-closeness and δ-presence, (4) methods for automated and manual evaluation of data utility, and (5) an intuitive coding model using generalization, suppression and microaggregation. ARX is highly scalable and allows for anonymizing datasets with several millions of records on commodity hardware. Moreover, it offers a comprehensive graphical user interface with wizards and visualizations that guide users through different aspects of the anonymization process. ARX is not just a toolbox, but a fully-fledged application, meaning that all implemented methods have been harmonized and integrated with each other. It is well understood that balancing privacy and data utility requires user feedback. To facilitate this interaction, ARX is highly configurable and provides various methods for exploring the solution space.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call