Abstract

The RDF (Resource Description Framework) data model has been used in various domains, such as the Web, government, and biology. The volume of RDF datasets is now growing significantly, and this growth raises a serious challenge: how to answer SPARQL queries on large RDF datasets efficiently. Here, we present TripleParallel, a large-scale RDF data system that implements block-based parallel processing of SPARQL queries on billion-triple RDF datasets. The system increases parallelism and overlaps data access with computation, which reduces the overall execution time of a query. TripleParallel also implements multiple parallel operators for processing joins in parallel. Experimental studies with several RDF datasets, including LUBM and the UniProt collection, demonstrate the performance gains of our approach, which outperforms the previous fastest system by more than an order of magnitude.
