Abstract
The Resource Description Framework (RDF) has become the standard data model for representing web data, but the rapidly growing size of web resources makes that data increasingly difficult to maintain and query. Distributed RDF triple stores and their storage layers have been under research for a decade. While multiple systems have tried to use the workload to guide partitioning and replication of the data set, they are not able to find optimal levels for replication, local index storage, and indexes cached in main memory together. In this paper we propose a novel unified optimization approach that enables a distributed RDF triple store to adapt its RDF storage layer in two aspects: the first concerns replication indexes, while the second concerns secondary-storage and main-memory indexes. Our system dynamically analyzes the workload, detects trends in its queries, measures their effectiveness, and incorporates them into benefit functions for the triples. The system uses those functions to make fully automated decisions, either expanding each node's secondary storage horizontally through replication or expanding it vertically by building more indexes. In the same way, the system makes horizontal or vertical decisions about the main memory of the working nodes. The final objective of the optimization process is to decrease future query execution time.