Abstract

Continuing success of research on social and computer networks requires open access to realistic measurement datasets. While these datasets can be shared, generally in the form of social or Internet graphs, doing so often risks exposing sensitive user data to the public. Unfortunately, current techniques to improve privacy on graphs only target specific attacks, and have been proven to be vulnerable against powerful de-anonymization attacks.Our work seeks a solution to share meaningful graph datasets while preserving privacy. We observe a clear tension between strength of privacy protection and maintaining structural similarity to the original graph. To navigate the tradeoff, we develop a differentially-private graph model we call Pygmalion. Given a graph G and a desired level of e-differential privacy guarantee, Pygmalion extracts a graph's detailed structure into degree correlation statistics, introduces noise into the resulting dataset, and generates a synthetic graph G'. G' maintains as much structural similarity to G as possible, while introducing enough differences to provide the desired privacy guarantee. We show that simply applying differential privacy to graphs results in the addition of significant noise that may disrupt graph structure, making it unsuitable for experimental study. Instead, we introduce a partitioning approach that provides identical privacy guarantees using much less noise. Applied to real graphs, this technique requires an order of magnitude less noise for the same privacy guarantees. Finally, we apply our graph model to Internet, web, and Facebook social graphs, and show that it produces synthetic graphs that closely match the originals in both graph structure metrics and behavior in application-level tests.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.