Piveau: A Large-Scale Open Data Management Platform Based on Semantic Web Technologies

Fabian Kirstein, Kyriakos Stefanidis, Sebastian Urbanek, Benjamin Dittwald, Manfred Hauswirth, Simon Dutkowski

doi:10.1007/978-3-030-49461-2_38

Abstract

The publication and (re)utilization of Open Data are still facing multiple barriers on technical, organizational and legal levels. These include limitations in interfaces, search capabilities, and the provision of quality information, as well as the lack of definite standards and implementation guidelines. Many Semantic Web specifications and technologies are specifically designed to address the publication of data on the web. In addition, many official publication bodies encourage and foster the development of Open Data standards based on Semantic Web principles. However, no existing solution for managing Open Data takes full advantage of these possibilities and benefits. In this paper, we present our solution “Piveau”, a fully-fledged Open Data management solution based on Semantic Web technologies. It harnesses a variety of standards, like RDF, DCAT, DQV, and SKOS, to overcome the barriers in Open Data publication. The solution puts a strong focus on assuring data quality and scalability. We give a detailed description of the underlying, highly scalable, service-oriented architecture, of how we integrated the aforementioned standards, and of how we used a triplestore as our primary database. We have evaluated our work in a comprehensive feature comparison to established solutions and through a practical application in a production environment, the European Data Portal. Our solution is available as Open Source.

Electronic supplementary material: The online version of this chapter (10.1007/978-3-030-49461-2_38) contains supplementary material, which is available to authorized users.

Highlights

Open Data constitutes a prospering and continuously evolving concept.
In the selection process, we focused only on indicators applicable to measurable technical aspects that reflect the overall objective of managing metadata.
In this paper, we have presented our scalable Open Data management platform Piveau.

Summary

Introduction

Open Data constitutes a prospering and continuously evolving concept. At its very core, this includes the publication and re-utilization of datasets. RDF is only a subset of the Semantic Web stack, and Open Data publishing does not benefit from the stack’s full potential, which offers features beyond data modeling. We developed a novel and scalable platform for managing Open Data in which the Semantic Web stack is a first-class citizen. Our work focuses on two central aspects: (1) the utilization of a variety of Semantic Web standards and technologies to cover the entire life-cycle of the Open Data publishing process, including data models for metadata, quality verification, reporting, harmonization, and machine-readable interfaces; and (2) a tailored microservice-based architecture and a suitable orchestration pattern that fit the requirements of an Open Data platform.
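To make the role of the metadata and quality data models more concrete, the following minimal sketch shows how a single dataset could be described with DCAT and DQV terms and serialized as N-Triples, the kind of statements a triplestore holds. It is an illustration only and not Piveau's implementation: the helper functions, the example URIs, the title, and the score are all hypothetical, and only the vocabulary terms (dcat:Dataset, dcat:distribution, dct:title, dqv:hasQualityMeasurement, dqv:value) come from the published W3C and Dublin Core vocabularies.

```python
# Illustrative sketch only: helper names, URIs, title, and score are made up.
DCAT = "http://www.w3.org/ns/dcat#"
DCT = "http://purl.org/dc/terms/"
DQV = "http://www.w3.org/ns/dqv#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XSD_DOUBLE = "http://www.w3.org/2001/XMLSchema#double"


def iri(value):
    """Wrap an absolute IRI in angle brackets, as N-Triples requires."""
    return f"<{value}>"


def literal(value):
    """Format a plain string literal with basic N-Triples escaping."""
    escaped = (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"'


def dataset_triples(dataset, title, distribution, measurement, score):
    """Describe one dataset, one distribution, and one quality measurement."""
    ds, dist, m = iri(dataset), iri(distribution), iri(measurement)
    return [
        (ds, iri(RDF_TYPE), iri(DCAT + "Dataset")),
        (ds, iri(DCT + "title"), literal(title)),
        (ds, iri(DCAT + "distribution"), dist),
        (dist, iri(RDF_TYPE), iri(DCAT + "Distribution")),
        (ds, iri(DQV + "hasQualityMeasurement"), m),
        (m, iri(RDF_TYPE), iri(DQV + "QualityMeasurement")),
        (m, iri(DQV + "value"), f'"{score}"^^{iri(XSD_DOUBLE)}'),
    ]


def to_ntriples(triples):
    """Serialize (subject, predicate, object) tuples, one statement per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


example = dataset_triples(
    "https://example.org/dataset/1",        # hypothetical dataset URI
    "Air quality measurements",             # hypothetical title
    "https://example.org/distribution/1",   # hypothetical distribution URI
    "https://example.org/measurement/1",    # hypothetical measurement URI
    0.8,                                    # hypothetical quality score
)
print(to_ntriples(example))
```

Because the metadata and the quality measurement share one graph, a quality score can be attached to a dataset without changing the dataset's own description, which is one reason a common RDF data model is convenient for this use case.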

Objectives

Results

Conclusion