Abstract

During the last decade, numerous governmental, educational or cultural institutions have launched Open Data initiatives that have made large volumes of datasets accessible on the web. The main way to publicize the availability of this data has been the deployment of Open Data catalogs that expose metadata describing these datasets, which web search engines can easily index. Open-source platforms have greatly eased the work of institutions involved in Open Data initiatives, making the setup of an Open Data portal an almost trivial task. However, few approaches have analyzed how precisely the metadata describes the associated datasets. Building on existing approaches for analyzing metadata quality in the Open Data context and in related domains, this work contributes to the state of the art by extending an ISO 19157-based method for checking the quality of geographic metadata to the context of Open Data metadata. Focusing on metadata models compliant with the Data Catalog Vocabulary proposed by the W3C, the extended method has been applied to evaluate the Open Data catalog of the Spanish Government. The results have also been compared with those obtained by the Metadata Quality Assessment methodology proposed by the European Data Portal.

Highlights

  • With the increasing interest in facilitating government transparency or public participation, many governments have launched Open Data initiatives to release their data on the web [1].

  • The main mechanism to publicize the availability of this data has been the deployment of Open Data catalogs that expose metadata describing these datasets, which are indexed by general web search engines or by specialized dataset search engines such as Google Dataset Search.

  • In this paper, we have proposed a new method for evaluating the quality of Open Data metadata based on ISO 19157.


Summary

Introduction

With the increasing interest in facilitating government transparency or public participation, many governments have launched Open Data initiatives to release their data on the web [1]. The main mechanism to publicize the availability of this data has been the deployment of Open Data catalogs that expose metadata describing these datasets, which are indexed by general web search engines or by specialized dataset search engines such as Google Dataset Search. DCAT-AP is an application profile of the Data Catalog Vocabulary (DCAT) for describing the datasets and distributions published on Open Data portals. It adds constraints on the metadata properties: minimum and maximum multiplicity of properties, and stricter ranges. These constraints are important if we aim to evaluate metadata quality aspects such as completeness or consistency.
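To make the role of multiplicity constraints concrete, the following is a minimal sketch of how a metadata record could be checked against minimum and maximum cardinalities. It is not the ISO 19157-based method proposed in the paper. The `CONSTRAINTS` table is a small illustrative subset with assumed cardinalities (the DCAT-AP specification is the authority for the actual values), and the function names and sample record are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# Assumed, illustrative constraints: property -> (min, max).
# A max of None means "unbounded". Check the DCAT-AP specification
# for the real cardinalities; these values are only for demonstration.
CONSTRAINTS: Dict[str, Tuple[int, Optional[int]]] = {
    "dct:title": (1, None),
    "dct:description": (1, None),
    "dct:publisher": (0, 1),
    "dct:modified": (0, 1),
}


def check_multiplicity(record: Dict[str, List[str]]) -> Dict[str, str]:
    """Return a message for every property whose value count is out of range."""
    violations: Dict[str, str] = {}
    for prop, (lo, hi) in CONSTRAINTS.items():
        count = len(record.get(prop, []))
        if count < lo:
            violations[prop] = f"expected at least {lo}, found {count}"
        elif hi is not None and count > hi:
            violations[prop] = f"expected at most {hi}, found {count}"
    return violations


def completeness(record: Dict[str, List[str]]) -> float:
    """Share of the constrained properties that have at least one value."""
    present = sum(1 for prop in CONSTRAINTS if record.get(prop))
    return present / len(CONSTRAINTS)


# Hypothetical dataset record: no description, and two modification dates.
record = {
    "dct:title": ["Air quality measurements"],
    "dct:modified": ["2020-01-01", "2020-02-01"],
}

violations = check_multiplicity(record)
score = completeness(record)

for prop, message in violations.items():
    print(f"{prop}: {message}")
print(f"completeness: {score:.2f}")
```

In this sketch a missing mandatory property and a property with too many values are both reported as violations, while the completeness score only counts whether a property is present at all. A real assessment would also need to validate property ranges, which this example leaves out.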
