LODsyndesis: Global Scale Knowledge Services

Michalis Mountantonakis, Yannis Tzitzikas

doi:10.3390/heritage1020023

Open Access

Journal: Heritage
Publication Date: Nov 17, 2018
Citations: 14
License type: CC BY 4.0

Affiliation: FORTH Institute of Computer Science, University of Crete

Abstract

In this paper, we present LODsyndesis, a suite of services over the datasets of the entire Linked Open Data Cloud, which offers fast, content-based dataset discovery and object co-reference. Emphasis is placed on supporting scalable cross-dataset reasoning for finding all information about any entity and its provenance. Other tasks that can benefit from these services are those related to the quality and veracity of data: collecting all information about an entity, together with the cross-dataset inference that becomes feasible, allows spotting the contradictions that exist, and also provides information for data cleaning or for estimating and suggesting which data are probably correct or more accurate. In addition, we show how these services can assist the enrichment of existing datasets with more features for obtaining better predictions in machine learning tasks. Finally, we report measurements that reveal the sparsity of the current datasets as regards their connectivity, which in turn justifies the need for advancing the current methods for data integration. Measurements focusing on the cultural domain are also included, specifically measurements over datasets using CIDOC CRM (Conceptual Reference Model), and connectivity measurements of British Museum data. The services of LODsyndesis are based on special indexes and algorithms and allow the indexing of 2 billion triples in around 80 min using a cluster of 96 computers.

Highlights

In recent years, a large volume of open data has been published, and this volume keeps increasing. Such open data need to be Findable, Accessible, Interoperable and Reusable (FAIR; see [1] for more information on the FAIR principles), and for this reason there are efforts to use standards and good practices to achieve these targets.
The main difficulties are the following: (i) publishers tend to use different models and formats for the representation of their data; (ii) different URIs (Uniform Resource Identifiers) or languages are used for describing the same entities; (iii) publishers describe their data by using different concepts, e.g., CIDOC CRM (Conceptual Reference Model) [3] represents the birth date of a person as an event, while DBpedia [4] uses a single triple for the same fact; (iv) data from different sources can be inconsistent or conflicting; (v) a lot of complementary information occurs in different sources; and (vi) many datasets are updated very frequently.
We observed that the publications domain is more connected than the LOD Cloud on average.
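
Difficulty (iii) above can be made concrete with a small sketch. It is only an illustration, not the method used by LODsyndesis: the ex: URIs and the literal are hypothetical, and the CIDOC CRM property names (P98i_was_born, P4_has_time-span, P82_at_some_time_within) follow a commonly used RDF naming of the model and should be treated as assumptions. The helper birth_dates is likewise a name introduced here.

```python
# Two ways of stating the same fact, as plain (subject, predicate, object) tuples.
# All ex: URIs and the literal value are hypothetical illustration data.
CRM_STYLE = [
    ("ex:El_Greco", "crm:P98i_was_born", "ex:birth_1"),      # person -> birth event
    ("ex:birth_1", "crm:P4_has_time-span", "ex:span_1"),     # event -> time span
    ("ex:span_1", "crm:P82_at_some_time_within", "1541"),    # time span -> date
]
DBPEDIA_STYLE = [
    ("ex:El_Greco", "dbo:birthDate", "1541"),                # single triple
]


def birth_dates(triples):
    """Return {person: set of birth dates}, accepting either modeling style."""
    # Index objects by (subject, predicate) so that paths can be followed.
    index = {}
    for s, p, o in triples:
        index.setdefault((s, p), []).append(o)

    result = {}
    for (s, p), objects in index.items():
        if p == "dbo:birthDate":
            # Direct style: the object already is the date.
            result.setdefault(s, set()).update(objects)
        elif p == "crm:P98i_was_born":
            # Event style: person -> birth event -> time span -> date.
            for event in objects:
                for span in index.get((event, "crm:P4_has_time-span"), []):
                    for date in index.get((span, "crm:P82_at_some_time_within"), []):
                        result.setdefault(s, set()).add(date)
    return result
```

Both birth_dates(CRM_STYLE) and birth_dates(DBPEDIA_STYLE) yield the same mapping for ex:El_Greco, which shows why facts modeled under different schemas have to be aligned before they can be compared or merged.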

Summary

Introduction

A large volume of open data has been published, and this volume keeps increasing. In order to find all URIs and facts about an entity, say El Greco, we have to index and enrich numerous datasets through cross-dataset inference. For this reason, i.e., to assist the process of semantic integration of data at large scale, we have designed and developed novel indexes, methods and tools [5,6,7]. The major characteristic of LODsyndesis is that it indexes the whole content of hundreds of datasets in the Linked Open Data cloud, taking into consideration the closure of equivalence relationships, and to the best of our knowledge LODsyndesis is the “largest knowledge graph of Linked Data that includes all inferred equivalence relationships”. All these semantics-aware indexes are exploited to perform fast connectivity analytics and to offer advanced connectivity services that are of primary importance for several real-world tasks.
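
The idea of taking the closure of equivalence relationships can be sketched on a toy scale. The following is a minimal single-machine illustration using a union-find structure, not the indexes or algorithms of LODsyndesis (which the paper describes as running on a cluster). It assumes equivalences are stated with owl:sameAs; the quads, dataset labels (D1 to D3), ex: URIs, and function names are all hypothetical.

```python
from collections import defaultdict

SAME_AS = "owl:sameAs"


def equivalence_classes(triples):
    """Group URIs into classes under the symmetric, transitive closure of owl:sameAs."""
    parent = {}

    def find(x):
        # Find the class representative, shortening paths along the way.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for s, p, o in triples:
        if p == SAME_AS:
            root_s, root_o = find(s), find(o)
            if root_s != root_o:
                parent[root_o] = root_s  # merge the two classes

    classes = defaultdict(set)
    for uri in list(parent):
        classes[find(uri)].add(uri)
    return list(classes.values())


def facts_about(uri, quads):
    """Return sorted (predicate, object, dataset) facts for every URI equivalent to `uri`.

    `quads` must be a list of (subject, predicate, object, dataset) tuples.
    """
    classes = equivalence_classes([(s, p, o) for s, p, o, _ in quads])
    members = next((c for c in classes if uri in c), {uri})
    return sorted((p, o, d) for s, p, o, d in quads if s in members and p != SAME_AS)


# Hypothetical data: three datasets use three different URIs for the same painter.
QUADS = [
    ("ex1:El_Greco", SAME_AS, "ex2:Theotokopoulos", "D1"),
    ("ex2:Theotokopoulos", SAME_AS, "ex3:painter_42", "D2"),
    ("ex1:El_Greco", "ex:birthPlace", "ex:Heraklion", "D1"),
    ("ex3:painter_42", "ex:movement", "ex:Mannerism", "D3"),
]
```

Calling facts_about("ex1:El_Greco", QUADS) returns the facts from both D1 and D3, each with its source dataset, even though no single dataset states that ex1:El_Greco and ex3:painter_42 are the same entity. That equivalence is only reachable by inference across D1 and D2.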

RDF and Linked Data

Related Work

Semantic Indexing Process

Performing Connectivity Analytics

LODsyndesis Services and Use Cases

How to Find the URI of an Entity

Connectivity Analytics for Publications Domain

Connectivity Analytics for British Museum

Conclusions about Connectivity of the LOD Cloud

Findings

Conclusions
