Resource Description Framework Triples Research Articles

During the last decade, Web APIs (Application Programming Interfaces) have gained significant traction, to the extent that they have become a de facto standard to enable HTTP-based, machine-processable data access. Despite this success, however, they still often fail to make data interoperable, insofar as they commonly rely on proprietary data models and vocabularies that lack the formal semantic descriptions essential to ensure reliable data integration. In the biodiversity domain, multiple data aggregators, such as the Global Biodiversity Information Facility (GBIF) and the Encyclopedia of Life (EoL), maintain specialized Web APIs giving access to billions of records about taxonomies, occurrences, or life traits (Triebel et al. 2012). They publish data sets spanning complementary and often overlapping regions, epochs or domains, but may also report or rely on potentially conflicting perspectives, e.g. with respect to the circumscription of taxonomic concepts. It is therefore of utmost importance for biologists and collection curators to be able to confront the knowledge they have about taxa with related data coming from third-party data sources. To tackle this issue, the French National Museum of Natural History (MNHN) has developed an application to edit TAXREF, the French taxonomic register for fauna, flora and fungi (Gargominy et al. 2018). TAXREF registers all species recorded in metropolitan France and overseas territories, accounting for 260,000+ biological taxa (200,000+ species) along with 570,000+ scientific names. The TAXREF-Web application compares data available in TAXREF with corresponding data from third-party data sources, points out disagreements and allows biologists to add, remove or amend TAXREF accordingly. This requires that TAXREF-Web developers write a specific piece of code for each considered Web API to align the TAXREF representation with its Web API counterpart. This task is time-consuming and makes maintenance of the web application cumbersome.
In this presentation, we report on a new implementation of TAXREF-Web that harnesses the Linked Data standards: Resource Description Framework (RDF), the Semantic Web format to represent knowledge graphs, and SPARQL, the W3C standard to query RDF graphs. In addition, we leverage the SPARQL Micro-Service architecture (Michel et al. 2018), a lightweight approach to query Web APIs using SPARQL. A SPARQL micro-service is a SPARQL endpoint that wraps a Web API service; it typically produces a small, resource-centric RDF graph by invoking the Web API and transforming the response into RDF triples. We developed SPARQL micro-services to wrap the Web APIs of GBIF, World Register of Marine Species (WoRMS), FishBase, Index Fungorum, Pan-European Species directories Infrastructure (PESI), ZooBank, International Plant Names Index (IPNI), EoL, Tropicos and Sandre. These micro-services consistently translate Web API responses into RDF graphs utilizing mainly two well-adopted vocabularies: Schema.org (Guha et al. 2015) and Darwin Core (Baskauf et al. 2015). This approach brings about two major advantages. First, the large adoption of Schema.org and Darwin Core ensures that the services can be immediately understood and reused by a large audience within the biodiversity community. Second, wrapping all these Web APIs in SPARQL micro-services “suddenly” makes them technically and semantically interoperable, since they all represent resources (taxa, habitats, traits, etc.) in a common manner. Consequently, the integration task is simplified: confronting data from multiple sources essentially consists of writing the appropriate SPARQL queries, which makes web application development and maintenance easier.
We present several concrete cases in which we use this approach to detect disagreements between TAXREF and the aforementioned data sources, with respect to taxonomic information (author, synonymy, vernacular names, classification, taxonomic rank), habitats, bibliographic references, species interactions and life traits.
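The core idea described above, translating each Web API response into triples that share one vocabulary and then comparing sources, can be illustrated with a minimal Python sketch. It is not the TAXREF-Web implementation: the JSON field names, the `http://example.org/` subject URI and the sample records are made up for illustration, and only the three Darwin Core term URIs are real.

```python
# Darwin Core terms namespace (real); everything else below is illustrative.
DWC = "http://rs.tdwg.org/dwc/terms/"

# Hypothetical mapping from a Web API's JSON keys to Darwin Core terms.
FIELD_TO_TERM = {
    "name": DWC + "scientificName",
    "authorship": DWC + "scientificNameAuthorship",
    "rank": DWC + "taxonRank",
}


def record_to_triples(subject_uri, record):
    """Translate one JSON-like record into a set of (subject, predicate, object) triples."""
    triples = set()
    for field, term in FIELD_TO_TERM.items():
        value = record.get(field)
        if value is not None:
            triples.add((subject_uri, term, str(value)))
    return triples


def disagreements(reference, other):
    """Return {predicate: (reference_value, other_value)} for predicates
    present in both triple sets whose values differ."""
    ref_values = {p: o for _, p, o in reference}
    other_values = {p: o for _, p, o in other}
    shared = ref_values.keys() & other_values.keys()
    return {
        p: (ref_values[p], other_values[p])
        for p in shared
        if ref_values[p] != other_values[p]
    }


# Made-up sample records standing in for TAXREF and a third-party Web API.
taxon = "http://example.org/taxon/1"  # hypothetical URI
taxref_record = {"name": "Delphinus delphis", "authorship": "Linnaeus, 1758", "rank": "species"}
third_party_record = {"name": "Delphinus delphis", "authorship": "L., 1758", "rank": "species"}

diff = disagreements(
    record_to_triples(taxon, taxref_record),
    record_to_triples(taxon, third_party_record),
)
for predicate, (ours, theirs) in diff.items():
    print(f"{predicate}: {ours!r} vs {theirs!r}")
```

In the architecture described in the abstract, the comparison would be expressed as SPARQL queries over the micro-services rather than as Python set operations; the sketch only shows why a shared vocabulary reduces the comparison to matching predicates.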


Web APIs (Application Programming Interfaces) are a common means for Web portals and data producers to enable HTTP-based, machine-processable access to their data. They are a prominent source of information pertaining to topics as diverse as scientific information, social networks, entertainment or finance. The methods of Linked Data (Heath and Bizer 2011) similarly aim to publish machine-readable data on the Web, while connecting related resources within and between datasets, thereby creating a large distributed knowledge graph. Today, the biodiversity community is increasingly adopting the Linked Data principles to publish data such as trait banks, museum collections and taxonomic registers (Parr et al. 2016, Baskauf et al. 2016). However, standard approaches are still missing to combine disparate representations coming from both Linked Data interfaces and the manifold Web APIs that were developed during the last two decades to expose legacy biodiversity databases on the Web.
The SPARQL Micro-Service architecture (Michel et al. 2018) tackles the goal of reconciling Linked Data interfaces and Web APIs. It proposes a lightweight method to query a Web API using SPARQL (Harris and Seaborne 2013), the Semantic Web standard to query knowledge graphs expressed in the Resource Description Framework (RDF). A SPARQL micro-service provides access to a small RDF graph, typically resource-centric, that it builds at run-time by transforming a fraction of the whole dataset served by the Web API into RDF triples. Furthermore, Web APIs traditionally rely on internal, proprietary resource identifiers that are unsuited for use as Uniform Resource Identifiers (URIs). To address this concern, a SPARQL micro-service can assign a URI to a Web API resource, allowing an application to look up this URI and get a description of the resource in return (this process is referred to as dereferencing).
In this demo, we wish to showcase the value of SPARQL micro-services in the biodiversity domain. We first query TAXREF-LD, a Linked Data representation of the French taxonomic register of living beings (Michel et al. 2017), to retrieve information about a given taxon. Then, we demonstrate how we can enrich our knowledge about this taxon with various types of data retrieved on-the-fly from multiple Web APIs: trait data from the Encyclopedia of Life trait bank (Parr et al. 2016), articles or books from the Biodiversity Heritage Library, audio recordings from the Macaulay scientific media archive, photos from the Flickr photography social network, and music tunes from MusicBrainz. Different visualizations are demonstrated, ranging from raw RDF triples to Web pages generated dynamically and integrating heterogeneous data, as suggested in Fig. 1. Depending on the audience’s interests, we shall touch upon the alignment of Web APIs’ proprietary vocabularies with well-adopted thesauri or ontologies, or more technical concerns, e.g. related to the effort required to deploy a new SPARQL micro-service.
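The URI assignment and dereferencing step mentioned above can be sketched in Python as follows. This is only one possible scheme under stated assumptions: the `http://example.org/ld` namespace and the `/<service>/id/<identifier>` pattern are hypothetical and are not taken from the SPARQL Micro-Service implementation. The point is that a minted URI must carry enough information to recover the Web API's internal identifier when the URI is looked up.

```python
from urllib.parse import quote, unquote

# Hypothetical namespace under which Web API resources are given URIs.
BASE = "http://example.org/ld"


def mint_uri(service, internal_id):
    """Build a URI for a Web API resource from its proprietary identifier.

    Percent-encoding with safe='' also encodes '/', so the identifier
    always occupies exactly one path segment.
    """
    return f"{BASE}/{quote(service, safe='')}/id/{quote(str(internal_id), safe='')}"


def resolve(uri):
    """Recover (service, internal_id) from a minted URI.

    On dereferencing, this pair is what a wrapper would need in order
    to call the underlying Web API and build the resource description.
    """
    prefix = BASE + "/"
    if not uri.startswith(prefix):
        raise ValueError(f"not a URI minted under {BASE}: {uri}")
    parts = uri[len(prefix):].split("/")
    if len(parts) != 3 or parts[1] != "id":
        raise ValueError(f"unexpected URI pattern: {uri}")
    return unquote(parts[0]), unquote(parts[2])


uri = mint_uri("flickr", "photo 42/a")  # made-up identifier
print(uri)
print(resolve(uri))
```

Keeping the mapping purely syntactic means no lookup table has to be stored: the identifier travels inside the URI itself.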

Articles published on Resource Description Framework Triples

JQPro: Join Query Processing in a Distributed System for Big RDF Data Using the Hash-Merge Join Technique

Mining Semantic Association Rules from RDF Data

FHIR-Ontop-OMOP: Building clinical knowledge graphs in FHIR RDF with the OMOP Common data Model

Exploiting lexical patterns for knowledge graph construction from unstructured text in Spanish

A unified ontology-based data integration approach for the internet of things

K-LM: Knowledge Augmenting in Language Models Within the Scholarly Domain

LODQuMa: A Free-ontology process for Linked (Open) Data quality management

Assessing Large-Scale, Cross-Domain Knowledge Bases for Semantic Search

OPAL: An extensible framework for ontology‐based program analysis

MESRG: multi-entity summarisation in RDF graph

An integration approach of multi-source heterogeneous fuzzy spatiotemporal data based on RDF

Assisting Biologists in Editing Taxonomic Information by Confronting Multiple Data Sources using Linked Data Standards

Indexing temporal RDF graph

Citrus ontology development based on the eight-point charter of agriculture

Transforming XML to RDF(S) with Temporal Information

Integration of Biodiversity Linked Data and Web APIs using SPARQL Micro-Services

Pragmatic thought as a philosophical foundation for collaborative tagging and the Semantic Web

GrandBase: generating actionable knowledge from Big Data

Htab2RDF: Mapping HTML Tables to RDF Triples

A MapReduce-based Approach to Scale Big Semantic Data Compression with HDT
