Abstract

An important part of Open Data is statistical in nature and describes economic and social indicators that monitor population size, inflation, trade, and employment. Combining and analysing Open Data from multiple datasets and sources enables advanced data analytics scenarios that could result in valuable services and data products. However, it is still difficult to discover and combine open statistical data that reside in different data portals. Although Linked Open Statistical Data (LOSD) provide standards and approaches to facilitate combining statistics on the Web, various interoperability challenges still exist. In this paper, we define interoperability conflicts that hamper combining and analysing LOSD from different portals. To this end, we start from a thorough literature review of interoperability conflicts in databases and data warehouses. Based on this review, we define interoperability conflicts that may appear in LOSD. We define two types of schema-level conflicts, namely naming conflicts and structural conflicts. Naming conflicts include homonyms and synonyms and result from the different URIs used in the data cubes. Structural conflicts result from different practices of modelling the structure of data cubes.
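The two kinds of naming conflict mentioned above can be illustrated with a minimal sketch. It assumes the usual database reading of the terms: synonyms are different URIs that denote the same concept, and homonyms are the same URI used for different concepts. The cube contents, the example.org URIs, and the function name find_naming_conflicts are all hypothetical and are not taken from the paper or from any real portal.

```python
# Hypothetical illustration: every URI and concept label below is made up.
# Each cube maps the URI it uses for a dimension to the concept that URI
# denotes in that cube.
cube_a = {
    "http://example.org/a/dimension/refArea": "geographic area",
    "http://example.org/shared/dimension/period": "calendar year",
}
cube_b = {
    "http://example.org/b/dimension/geo": "geographic area",
    "http://example.org/shared/dimension/period": "fiscal year",
}


def find_naming_conflicts(first, second):
    """Return (synonyms, homonyms) between two URI-to-concept mappings."""
    # Synonyms: the same concept appears under two different URIs.
    synonyms = sorted(
        (uri_1, uri_2)
        for uri_1, concept_1 in first.items()
        for uri_2, concept_2 in second.items()
        if concept_1 == concept_2 and uri_1 != uri_2
    )
    # Homonyms: the same URI denotes different concepts in the two cubes.
    homonyms = sorted(
        uri for uri in first.keys() & second.keys() if first[uri] != second[uri]
    )
    return synonyms, homonyms


synonyms, homonyms = find_naming_conflicts(cube_a, cube_b)
print("Synonyms:", synonyms)
print("Homonyms:", homonyms)
```

In practice the concept behind a URI is rarely available as a plain label, so the comparison step would need a richer notion of equivalence than string equality; the sketch only shows the shape of the two conflict types.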

Highlights

  • In recent years, an increasing number of governments, public authorities, and companies have opened up their data, providing a vast amount of Open Data through numerous portals [18]

  • Interoperability among data cubes is crucial to unleash the full potential of linked statistical data

  • It will enable performing combined analytics and visualizations on data published by different national statistics offices and other organisations


Summary

Introduction

An increasing number of governments, public authorities, and companies have opened up their data, providing a vast amount of Open Data through numerous portals [18]. Integrating data from different sources will unleash the full potential of Open Data [27, 35, 26, 33]. This will enable, for example, performing combined analytics on top of data published by different national statistics offices [17]. An important step in this direction is the RDF data cube (QB) vocabulary [11], which enables modelling Linked Open Statistical Data (LOSD) in a standardised manner. The aim of this paper is to define interoperability conflicts that hamper combining and analysing LOSD from different data portals. To this end, we first identify interoperability conflicts in databases and data warehouses through a thorough literature review and, subsequently, map those conflicts to LOSD interoperability conflicts.

Research approach
Background
The Data Cube model
Linked Statistical Data
Portals with Linked Statistical Data
Schema-level conflicts
Schema-level conflicts are classified into naming and structural conflicts.
Data-level conflicts
Naming conflicts
Structural conflicts
Conclusion and future work
