Abstract

Since its launch as the standard language of the semantic web, the Resource Description Framework (RDF) has gained enormous importance in many fields. This has led to a variety of data systems for storing and processing RDF data. To help users identify the RDF data stores best suited to their needs, we establish a list of criteria for evaluating and comparing existing RDF management systems, also called triplestores. This is the first work to address this topic for triplestores. The criteria list covers a range of aspects and is not limited to a particular kind of store: it applies to all types, including relational, native, centralized, distributed and big data stores. The criteria are organized around the main tasks of a triplestore: RDF data storage, RDF data processing, performance, distribution and ease of use. As a case study, we apply the evaluation criteria to the graph-based RDF triplestore AllegroGraph.

Highlights

  • The primary goal of the W3C (World Wide Web Consortium) standardized ontology language RDF (Resource Description Framework, [24]) and its query language SPARQL (SPARQL Protocol and RDF Query Language, [25]) is to enrich the Web with semantics by structuring data through linking

  • To address the problems that relational systems face in handling RDF data, various RDF data management systems have been proposed over the past decade, ranging from NoSQL-based systems through native triplestores to Big Data solutions

  • Triplestores should provide sound strategies for selecting the RDF data to replicate, controlling storage availability, and handling data changes caused by updates, insertions or deletions


Summary

INTRODUCTION

The primary goal of the W3C (World Wide Web Consortium) standardized ontology language RDF (Resource Description Framework, [24]) and its query language SPARQL (SPARQL Protocol and RDF Query Language, [25]) is to enrich the Web with semantics by structuring data through linking. The aim of this work is to give a complete list of evaluation and comparison criteria for RDF management systems. To this end, we first give a summarized categorization of existing triplestores and consider the motivations behind their use for handling RDF data. With the established criteria list, we aim to give users detailed insight into the various RDF management systems and into the comparison aspects relevant to handling RDF data.
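
To make the notions of linked RDF data and a triplestore concrete, the following is a minimal sketch in plain Python, not the API of any real triplestore. It assumes a toy in-memory list of (subject, predicate, object) triples; the `ex:` names and the helper functions `match` and `friends_of_friends` are hypothetical and chosen only for illustration. Pattern matching with wildcards plays the role that triple patterns play in a SPARQL query.

```python
from typing import Iterable, Iterator, List, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Hypothetical example data; the ex: identifiers are made up for illustration.
TRIPLES: List[Triple] = [
    ("ex:alice", "ex:knows", "ex:bob"),
    ("ex:bob", "ex:knows", "ex:carol"),
    ("ex:alice", "ex:worksAt", "ex:acme"),
]


def match(
    triples: Iterable[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> Iterator[Triple]:
    """Yield every triple that fits the pattern; None acts as a wildcard,
    much like a variable in a SPARQL triple pattern."""
    for triple in triples:
        if all(want is None or want == got for want, got in zip((s, p, o), triple)):
            yield triple


def friends_of_friends(triples: List[Triple], person: str) -> List[str]:
    """Two-hop join over ex:knows, analogous to the SPARQL graph pattern
    { <person> ex:knows ?x . ?x ex:knows ?y }."""
    return [
        y
        for _, _, x in match(triples, s=person, p="ex:knows")
        for _, _, y in match(triples, s=x, p="ex:knows")
    ]
```

Calling `friends_of_friends(TRIPLES, "ex:alice")` follows two `ex:knows` links in a row. A real triplestore answers the same kind of question, but with indexing, query optimization and persistent storage, which are among the aspects the criteria list evaluates.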

SEMANTIC WEB STANDARDS AND RELATED WORK
RDF and SPARQL
Schema Languages RDFS and OWL
RDF Triplestores
CRITERIA RELATED TO RDF DATA STORAGE
Compliance with RDF Data Model
RDF Data Validation
Storage Capacity
Data Portability and Serialization
Integration of Other Data Sources
Support for SPARQL Constructs
Data Retrieval and Modification Time Costs
Indexing
Reasoning
Support for ACID Properties
Query Optimization
Support for Programming Languages
Support for BI
Streaming Capabilities
Crash Recovery
Data Replication and Partitioning
Scalability
Data Visualization and User APIs
CASE STUDY
CONCLUSION