Abstract

Benchmarking efforts need synthetic graphs. Synthetic graphs that mimic real-world graphs make it possible to avoid sending sensitive information to third parties while preserving the topological characteristics of the original graph. Because their size can be scaled, they can also be used to evaluate the scalability of different algorithms. In view of these applications, we introduce a novel approach to mimic RDF graphs. Our approach introduces a random rotation into the tensor factorization of the input RDF graph. By combining the resulting rotation matrix with the core tensor computed by the factorization, our approach generates a graph that maintains the querying characteristics of the input graph while not permitting a reconstruction of the input graph. We evaluate our approach on Semantic Web Dog Food and DBpedia 2016, comparing the original, reconstructed, and synthetic graphs by using them to benchmark five triple stores. The results show that the Pearson correlation between the performance of the triple stores on the original and synthetic graphs is 0.91 for Semantic Web Dog Food and 0.64 for DBpedia. Our results also suggest that the synthetic graphs inherit the main graph characteristics of the original graphs. SynthG is open source and available at https://github.com/dice-group/SynthG.
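
The rotation idea can be illustrated with a minimal sketch. It is not the SynthG implementation: it assumes a RESCAL-style factorization in which each relation slice is approximated as A R_k A^T, and the function names, the `density` parameter, and the quantile-based thresholding are hypothetical choices made only for illustration. The factor matrix A is multiplied by a random rotation Q while the core tensor R is left unchanged, so the scored tensor no longer equals the reconstruction of the input.

```python
import numpy as np

def random_rotation(rank, rng):
    """Draw a random rotation matrix (orthogonal, det = +1) via QR of a Gaussian matrix."""
    q, r = np.linalg.qr(rng.standard_normal((rank, rank)))
    q = q * np.sign(np.diag(r))      # normalize column signs
    if np.linalg.det(q) < 0:         # flip one column to turn a reflection into a rotation
        q[:, 0] = -q[:, 0]
    return q

def synthesize(A, R, rng, density=0.1):
    """Score triples from a rotated factor matrix and keep the top `density` fraction.

    A: (n_entities, rank) factor matrix, R: (n_relations, rank, rank) core tensor.
    Both are assumed to come from an existing factorization of the input graph.
    """
    Q = random_rotation(A.shape[1], rng)
    A_rot = A @ Q                    # rotated entity embeddings
    # scores[k, i, j] = (A_rot @ R[k] @ A_rot.T)[i, j]
    scores = np.einsum("ir,krs,js->kij", A_rot, R, A_rot)
    threshold = np.quantile(scores, 1.0 - density)
    return scores >= threshold, Q    # boolean tensor of synthetic triples

# Toy usage with random factors standing in for a real factorization.
rng = np.random.default_rng(0)
A = rng.standard_normal((6, 3))
R = rng.standard_normal((2, 3, 3))
triples, Q = synthesize(A, R, rng)
```

In this sketch `triples[k, i, j]` being true means that the synthetic graph contains an edge with predicate k from entity i to entity j. How the actual approach selects triples, and how it scales the graph size, is not specified in the abstract.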
