Triple Store Research Articles

Introduction: Existing methods to make data Findable, Accessible, Interoperable, and Reusable (FAIR) are usually carried out in a post hoc manner: after the research project is conducted and data are collected. De novo FAIRification, on the other hand, incorporates the FAIRification steps in the process of a research project. In medical research, data are often collected and stored via electronic Case Report Forms (eCRFs) in Electronic Data Capture (EDC) systems. By implementing a de novo FAIRification process in such a system, the reusability and, thus, scalability of FAIRification across research projects can be greatly improved. In this study, we developed and implemented a novel method for de novo FAIRification via an EDC system. We evaluated our method by applying it to the Registry of Vascular Anomalies (VASCA).

Methods: Our method, which is independent of both the EDC system and the research project, ensures that eCRF data entered into an EDC system can be transformed into machine-readable, FAIR data using a semantic data model (a canonical representation of the data, based on ontology concepts and semantic web standards) and mappings from the model to questions on the eCRF. The FAIRified data are stored in a triple store and can, together with associated metadata, be accessed and queried through a FAIR Data Point. The method was implemented in Castor EDC, an EDC system, through a data transformation application. The FAIRness of the output of the method, the FAIRified data and metadata, was evaluated using the FAIR Evaluation Services.

Results: We successfully applied our FAIRification method to the VASCA registry. Data entered on eCRFs are automatically transformed into machine-readable data and can be accessed and queried using SPARQL queries in the FAIR Data Point. Twenty-one FAIR Evaluator tests pass and one test regarding the metadata persistence policy fails, since this policy is not in place yet.

Conclusion: In this study, we developed a novel method for de novo FAIRification via an EDC system. Its application in the VASCA registry and the automated FAIR evaluation show that the method can be used to make clinical research data FAIR when they are entered in an eCRF without any intervention from data management and data entry personnel. Due to the generic approach and developed tooling, we believe that our method can be used in other registries and clinical trials as well.
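The mapping step described in the Methods can be illustrated with a minimal sketch. It is not the authors' transformation application: the field names, the IRIs under example.org, and the function names are all hypothetical, and a real semantic data model would use typed ontology nodes rather than plain string literals. The sketch only shows the general idea of turning eCRF answers into triples through a field-to-predicate mapping, emitted here as N-Triples lines.

```python
# Hypothetical mapping from eCRF field names to predicate IRIs.
# These names and IRIs are placeholders, not the VASCA semantic model.
FIELD_TO_PREDICATE = {
    "birth_year": "http://example.org/model/hasBirthYear",
    "diagnosis": "http://example.org/model/hasDiagnosis",
}


def escape_literal(value: str) -> str:
    """Escape the characters that must be escaped in an N-Triples string literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def ecrf_to_ntriples(record_id: str, answers: dict) -> list:
    """Turn one filled form (field -> answer) into a list of N-Triples lines."""
    subject = f"<http://example.org/record/{record_id}>"
    lines = []
    for field, value in answers.items():
        predicate = FIELD_TO_PREDICATE.get(field)
        if predicate is None:
            continue  # questions without a mapping are skipped
        literal = escape_literal(str(value))
        lines.append(f'{subject} <{predicate}> "{literal}" .')
    return lines


# Example form: "note" has no mapping, so it produces no triple.
triples = ecrf_to_ntriples(
    "001", {"birth_year": 1990, "diagnosis": "example", "note": "free text"}
)
for line in triples:
    print(line)
```

Keeping the mapping in a separate table, apart from the transformation code, is what would let the same code be reused across forms and projects; only the mapping changes.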

Geospatial extensions of SPARQL, such as GeoSPARQL and stSPARQL, have been defined since 2007. While several geospatial RDF stores have implemented a substantial part of these extensions, other stores have limited their support mostly to point geometry features. In parallel, RDF frameworks have matured, offering a richer set of geospatial features such as GeoSPARQL support and the latest indexing technologies. As a consequence, a shift in the use of RDF frameworks is to be expected, from base platforms that users extend to create more complete geospatial RDF stores, to attractive finished RDF solutions for many geospatial applications. Together with the ever-increasing size of linked geospatial data that semantic stores need to handle, these developments motivated our group to improve Geographica, our benchmark for single-node systems, originally defined in 2013. Geographica 2 is more comprehensive: it now includes new geospatial RDF stores and frameworks, big real-world datasets of many hundred million triples with up to 50 million features of complex geometries, and new tests and queries that reveal the scalability of these systems. The augmented and revised real-world workload of Geographica 2 tests the efficiency of primitive spatial functions in RDF stores and their performance in the geocoding scenario against the new Census dataset, in addition to many other real use case scenarios, and finally includes computation of statistics for geospatial datasets. A more detailed and systematic evaluation is performed using the synthetic workload. The new scalability workload aims at discovering the limits of centralized geospatial RDF stores of various architectures. It employs a set of six well-balanced real-world datasets with highly complex geometries covering many European countries and compares three RDF stores in terms of storage space, bulk loading and query response time. In addition, a special version of the benchmark has been created for systems with limited geospatial functionality, and two more systems of this category are introduced alongside the six systems of the main benchmark, all stressed against point-only subsets of the workloads. Three out of the eight systems use an RDBMS for the persistence layer, while some of them offer a variety of persistence options.

Articles published on Triple Store

Review of: "CRAFTS: Configurable REST APIs For Triple Stores"

Sublinear Random Access Generators for Preferential Attachment Graphs

Building the Semantic Layer of the Józef Piłsudski Digital Archive With an Ontology-Based Approach

A unified metamodel for NoSQL and relational databases

Synospecies, an application to reflect changes in taxonomic names based on a triple store based on taxonomic data liberated from publication

Usage-Centric Benchmarking of RDF Triple Stores

Querying RDF Databases with Sub-CONSTRUCTs

View selection over knowledge graphs in triple stores

Knowledge Graphs

De-novo FAIRification via an Electronic Data Capture system by automated transformation of filled electronic Case Report Forms into machine-readable data

RDFFrames: knowledge graph access for machine learning tools

SPARQL2Flink: Evaluation of SPARQL Queries on Apache Flink

Hammer lightweight graph partitioner based on graph data volumes

GSBRL: Efficient RDF graph storage based on reinforcement learning

An empirical study on the evaluation of the RDF storage systems

RAMP-TAO

The Representation of Large-Scale Graph Based on Semi-Supervised Learning

Evaluating Geospatial RDF Stores Using the Benchmark Geographica 2

A Workload-Adaptive Streaming Partitioner for Distributed Graph Stores

RDFAdaptor: Efficient ETL Plugins for RDF Data Process
