Resource Description Framework Data Model Research Articles

The Web of Data has been gaining momentum in recent years. As a result, more and more semi-structured datasets are being published, in many cases following the RDF (Resource Description Framework) data model, which is based on atomic triple units of subject, predicate, and object. Although the model is very simple, specific compression methods become necessary because datasets keep growing and various scalability issues arise around their organization and storage. This requirement is even more restrictive in RDF stores because efficient SPARQL resolution on the compressed RDF datasets is also required. This article introduces a novel RDF indexing technique that supports efficient SPARQL resolution in compressed space. Our technique, called $k^2$-triples, uses the predicate to vertically partition the dataset into disjoint subsets of (subject, object) pairs, one per predicate. These subsets are represented as binary matrices of subjects $\times$ objects in which 1-bits mean that the corresponding triple exists in the dataset. This model results in very sparse matrices, which are efficiently compressed using $k^2$-trees. We enhance this model with two compact indexes listing the predicates related to each different subject and object in the dataset, in order to address the specific weaknesses of vertically partitioned representations. The resulting technique not only yields by far the most compressed representations but also delivers the best overall performance for RDF retrieval in our experimental setup. Our approach uses up to 10 times less space than a state-of-the-art baseline and is several orders of magnitude faster on the most basic query patterns. In addition, we optimize traditional join algorithms on $k^2$-triples and define a novel one leveraging its specific features. Our experimental results show that our technique also outperforms traditional vertical partitioning for join resolution, reporting the best numbers for joins in which the non-joined nodes are provided and remaining competitive in most cases.
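
The partitioning step described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names and the sample triples are hypothetical, the matrices are plain nested lists, and the $k^2$-tree compression step is left out entirely. The sketch only shows how triples split into one (subject, object) set per predicate, how such a set maps to a binary matrix, and how a per-subject predicate index limits the partitions visited when the predicate is unbound.

```python
from collections import defaultdict


def vertical_partition(triples):
    """Split triples into one (subject, object) pair set per predicate.

    Also builds the two auxiliary indexes mentioned in the abstract:
    the predicates related to each subject and to each object.
    """
    pairs_by_predicate = defaultdict(set)
    subject_predicates = defaultdict(set)
    object_predicates = defaultdict(set)
    for s, p, o in triples:
        pairs_by_predicate[p].add((s, o))
        subject_predicates[s].add(p)
        object_predicates[o].add(p)
    return pairs_by_predicate, subject_predicates, object_predicates


def to_binary_matrix(pairs, subjects, objects):
    """Render one partition as a subjects x objects matrix of 0/1 bits."""
    row = {s: i for i, s in enumerate(subjects)}
    col = {o: j for j, o in enumerate(objects)}
    matrix = [[0] * len(objects) for _ in subjects]
    for s, o in pairs:
        matrix[row[s]][col[o]] = 1  # a 1-bit means the triple exists
    return matrix


def match_subject(s, pairs_by_predicate, subject_predicates):
    """Resolve the pattern (s, ?p, ?o), visiting only predicates linked to s."""
    return sorted(
        (s, p, o)
        for p in subject_predicates.get(s, ())
        for (s2, o) in pairs_by_predicate[p]
        if s2 == s
    )


# Hypothetical sample data, not taken from the article.
triples = [
    ("alice", "knows", "bob"),
    ("alice", "worksAt", "acme"),
    ("bob", "knows", "carol"),
]
parts, sp_index, op_index = vertical_partition(triples)
subjects = sorted({s for s, _, _ in triples})
objects = sorted({o for _, _, o in triples})
knows_matrix = to_binary_matrix(parts["knows"], subjects, objects)
```

Without the subject index, a query with an unbound predicate would have to probe every partition; with it, `match_subject("alice", parts, sp_index)` only touches the `knows` and `worksAt` partitions.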

Background: Neuroscientists often need to access a wide range of data sets distributed over the Internet. These data sets, however, are typically neither integrated nor interoperable, resulting in a barrier to answering complex neuroscience research questions. Domain ontologies can enable the querying of heterogeneous data sets, but they are not sufficient for neuroscience since the data of interest commonly span multiple research domains. To this end, e-Neuroscience seeks to provide an integrated platform for neuroscientists to discover new knowledge through seamless integration of the very diverse types of neuroscience data. Here we present a Semantic Web approach to building this e-Neuroscience framework by using the Resource Description Framework (RDF) and its vocabulary description language, RDF Schema (RDFS), as a standard data model to facilitate both representation and integration of the data.

Results: We have constructed a pilot ontology for BrainPharm (a subset of SenseLab) using RDFS and then converted a subset of the BrainPharm data into RDF according to the ontological structure. We have also integrated the converted BrainPharm data with existing RDF hypothesis and publication data from a pilot version of SWAN (Semantic Web Applications in Neuromedicine). Our implementation uses the RDF Data Model in Oracle Database 10g release 2 for data integration, query, and inference, while our Web interface allows users to query the data and retrieve the results in a convenient fashion.

Conclusion: Accessing and integrating biomedical data that cut across multiple disciplines will be increasingly indispensable and beneficial to neuroscience researchers. The Semantic Web approach we undertook has demonstrated a promising way to semantically integrate data sets created independently. It also shows how advanced queries and inferences can be performed over the integrated data, which are hard to achieve using traditional data integration approaches. Our pilot results suggest that our Semantic Web approach is suitable for realizing e-Neuroscience and generic enough to be applied in other biomedical fields.
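
The integration idea in this abstract, that independently created RDF data sets can be combined and queried as one graph, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: triples are modeled as plain tuples in sets, every identifier (`ex:drugX`, `ex:receptorY`, `ex:hypothesis1`, `ex:targets`, `ex:mentions`) is hypothetical and not taken from BrainPharm or SWAN, and no RDFS inference or Oracle functionality is modeled.

```python
def match(graph, s=None, p=None, o=None):
    """Return the triples in graph that fit a pattern; None acts as a wildcard."""
    return {
        t for t in graph
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    }


def drugs_relevant_to(graph, hypothesis):
    """Join two patterns: entities a hypothesis mentions, and drugs targeting them."""
    results = set()
    for _, _, entity in match(graph, s=hypothesis, p="ex:mentions"):
        for drug, _, _ in match(graph, p="ex:targets", o=entity):
            results.add(drug)
    return results


# Two hypothetical, independently created data sets (all names are made up).
pharmacology_data = {("ex:drugX", "ex:targets", "ex:receptorY")}
hypothesis_data = {("ex:hypothesis1", "ex:mentions", "ex:receptorY")}

# Because both use the same triple model and share an identifier,
# integration is a plain set union.
merged = pharmacology_data | hypothesis_data
relevant = drugs_relevant_to(merged, "ex:hypothesis1")
```

The join only succeeds on the merged graph, because the link between the hypothesis and the drug runs through an identifier that appears in both sources.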

Articles published on Resource Description Framework Data Model

Constructing a knowledge graph for open government data: the case of Nova Scotia disease datasets

Hybrid data model of PACE and quadruple: an efficient data model for cloud computing

Efficiently Processing and Storing Library Linked Data using Apache Spark and Parquet

Retracted: Semantic Information Integration with Linked Data Mashups Approaches

An Analysis of RDF Storage Models and Query Optimization Techniques

Compressed vertical partitioning for efficient RDF management

TogoTable: cross-database annotation system using the Resource Description Framework (RDF) data model

Semantic Retrieval Based on SPARQL and Fuzzy Ontology for Electronic Commerce

Foundations of Semantic Web databases

Can Bibliographic Data be Put Directly onto the Semantic Web?

Fuzzy Semantic Retrieval for Traffic Information Based on Fuzzy Ontology and RDF on the Semantic Web

N3Logic: A logical framework for the World Wide Web

Product Life-Cycle Metadata Modeling and Its Application with RDF

AlzPharm: integration of neurodegeneration data using RDF

TRANSFORMATION FROM SEMANTIC DATA MODEL TO RDF

Automatic RDF metadata generation for resource discovery
