Resource Description Framework Storage Research Articles

Semantic interoperability for the Internet of Things (IoT) is enabled by standards and technologies from the Semantic Web. As recent research suggests a move towards decentralised IoT architectures, we have investigated the scalability and robustness of RDF (Resource Description Framework) engines that can be embedded throughout the architecture, in particular at edge nodes. RDF processing at the edge facilitates the deployment of semantic integration gateways closer to low-level devices. Our focus is on how to enable scalable and robust RDF engines that can operate on lightweight devices. In this paper, we first carry out an empirical study of the scalability and behaviour of RDF data management solutions that were built for standard computing hardware and have been ported to run on lightweight devices at the network edge. The findings of our study show that these RDF store solutions have several shortcomings on commodity ARM (Advanced RISC Machine) boards that are representative of IoT edge node hardware. This has inspired us to introduce RDF4Led, a lightweight RDF engine for edge devices that comprises an RDF store and a SPARQL processor. RDF4Led follows the RISC-style (Reduced Instruction Set Computer) design philosophy. The design comprises a flash-aware storage structure, an indexing scheme, an alternative buffer management technique and a low-memory-footprint join algorithm, which together demonstrate improved scalability and robustness over competing solutions. With a significantly smaller memory footprint, we show that RDF4Led can handle 2 to 5 times more data than popular RDF engines such as Jena TDB (Tuple Database) and RDF4J, while consuming the same amount of memory. In particular, RDF4Led requires 10%–30% of the memory of its competitors to operate on datasets of up to 50 million triples. On memory-constrained ARM boards, it can perform faster updates and can scale better than Jena TDB and Virtuoso. Furthermore, we demonstrate considerably faster query operations than Jena TDB and RDF4J.


The Web of Data has been gaining momentum in recent years. This has led to the publication of more and more semi-structured datasets following, in many cases, the RDF (Resource Description Framework) data model, which is based on atomic triple units of subject, predicate, and object. Although it is a very simple model, specific compression methods become necessary because datasets are increasingly large and various scalability issues arise around their organization and storage. This requirement is even more restrictive in RDF stores because efficient SPARQL resolution on the compressed RDF datasets is also required. This article introduces a novel RDF indexing technique that supports efficient SPARQL resolution in compressed space. Our technique, called k²-triples, uses the predicate to vertically partition the dataset into disjoint subsets of (subject, object) pairs, one per predicate. These subsets are represented as binary matrices of subjects × objects in which 1-bits mean that the corresponding triple exists in the dataset. This model results in very sparse matrices, which are efficiently compressed using k²-trees. We enhance this model with two compact indexes listing the predicates related to each different subject and object in the dataset, in order to address the specific weaknesses of vertically partitioned representations. The resulting technique not only achieves by far the most compressed representations, but also achieves the best overall performance for RDF retrieval in our experimental setup. Our approach uses up to 10 times less space than a state-of-the-art baseline and outperforms it in query time by several orders of magnitude on the most basic query patterns. In addition, we optimize traditional join algorithms on k²-triples and define a novel one leveraging its specific features. Our experimental results show that our technique also outperforms traditional vertical partitioning for join resolution, reporting the best numbers for joins in which the non-joined nodes are provided, and being competitive in most of the other cases.
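The partitioning and compression steps described in this abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes subjects and objects are already dictionary-encoded as small integer IDs, fixes k = 2, and uses hypothetical function names (partition_by_predicate, predicate_indexes, build_k2_tree, has_cell). The rank operation is computed with a plain sum for readability, whereas a practical structure would use a constant-time rank index over the bit array.

```python
from collections import defaultdict

K = 2  # tree arity (assumption): each node splits its submatrix into K*K blocks


def partition_by_predicate(triples):
    """Vertical partitioning: one set of (subject, object) pairs per predicate."""
    parts = defaultdict(set)
    for s, p, o in triples:
        parts[p].add((s, o))
    return parts


def predicate_indexes(triples):
    """Two auxiliary indexes: predicates per subject and predicates per object."""
    sp, op = defaultdict(set), defaultdict(set)
    for s, p, o in triples:
        sp[s].add(p)
        op[o].add(p)
    return sp, op


def build_k2_tree(cells, size):
    """Encode a size x size binary matrix (size a power of K) in level order.

    Returns (T, L): T holds the bits of the internal levels, L the bits of
    the last level, where each bit is a single matrix cell.
    """
    T, L = [], []
    queue = [(0, 0, size, list(cells))]
    while queue:
        next_queue = []
        for r0, c0, n, pts in queue:
            step = n // K
            for i in range(K):
                for j in range(K):
                    rlo, clo = r0 + i * step, c0 + j * step
                    sub = [(r, c) for r, c in pts
                           if rlo <= r < rlo + step and clo <= c < clo + step]
                    bit = 1 if sub else 0
                    if step == 1:
                        L.append(bit)
                    else:
                        T.append(bit)
                        if bit:  # only non-empty blocks are expanded further
                            next_queue.append((rlo, clo, step, sub))
        queue = next_queue
    return T, L


def has_cell(T, L, size, r, c):
    """Check whether cell (r, c) is set, i.e. whether the triple exists."""
    bits = T + L
    pos, n = 0, size
    while True:
        step = n // K
        idx = pos + (r // step) * K + (c // step)
        if not bits[idx]:
            return False  # empty block: nothing stored below it
        if step == 1:
            return True
        # The children of the j-th 1-bit in T start at position j * K * K.
        pos = sum(T[:idx + 1]) * K * K
        r, c, n = r % step, c % step, step


# Toy dataset with made-up integer IDs for subjects and objects.
triples = [(0, "knows", 1), (3, "knows", 2), (0, "name", 3)]
parts = partition_by_predicate(triples)
T, L = build_k2_tree(parts["knows"], 4)
print(T, L)                     # [1, 0, 0, 1] [0, 1, 0, 0, 0, 0, 1, 0]
print(has_cell(T, L, 4, 3, 2))  # True
print(has_cell(T, L, 4, 2, 2))  # False
```

In this toy case the 4 × 4 "knows" matrix is stored in 12 bits rather than 16; the saving grows with sparsity because all-zero blocks are cut off at the first level where they appear. The two predicate indexes let a query with an unbound predicate visit only the matrices in which a given subject or object actually occurs.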


Articles published on Resource Description Framework Storage

Analyzing workload trends for boosting triple stores performance

RDF(S) Store in Object-Relational Databases

Space/time-efficient RDF stores based on circular suffix sorting

Distributed subgraph query for RDF graph data based on MapReduce

MuSe: a multi-level storage scheme for big RDF data using MapReduce

RDF for temporal data management – a survey

Temporal RDF(S) Data Storage and Query with HBase

In-memory parallelization of join queries over large ontological hierarchies

Pushing the Scalability of RDF Engines on IoT Edge Devices

Storing and querying fuzzy RDF(S) in HBase databases

RDF DATABASES – CASE STUDY AND PERFORMANCE EVALUATION

TripleID-Q: RDF Query Processing Framework Using GPU

A survey of RDF management technologies and benchmark datasets

E-Assessment Data Compatibility Resolution Methodology with Bidirectional Data Transformation

Characterising RDF data sets

Storing massive Resource Description Framework (RDF) data: a survey

A study about integrating video contents with web services based on the RDF

An Analysis of RDF Storage Models and Query Optimization Techniques

Compressed vertical partitioning for efficient RDF management

Semantic Web repositories for genomics data using the eXframe platform
