On Metadata Quality in Sceiba, a Platform for Quality Control and Monitoring of Cuban Scientific Publications

Eduardo Arencibia, Yohannis Marti-Lahera, Rafael Martinez, Marc Goovaerts

doi:10.1007/978-3-030-98876-0_9

Abstract

This paper introduces Sceiba, a platform for quality control and monitoring of Cuban scientific publications. To this end, Sceiba needs to collect scientific publications comprehensively at the national level. Metadata quality is crucial for Sceiba’s interoperability and development. The paper describes how metadata quality is assured and enhanced in Sceiba. A metadata aggregation pipeline collects, transforms, stores and exposes metadata on Persons, Organizations, Sources, and Scientific Publications. The transformation of raw data into Sceiba’s internal metadata models includes cleaning, disambiguation, deduplication, entity linking, validation, standardization, and enrichment, using a semi-automated approach aligned with the findability, accessibility, interoperability, and reusability (FAIR) principles. To meet its metadata quality requirements, Sceiba uses a three-layer metadata structure: 1) discovery metadata, which allows relevant scientific publications to be discovered by browsing or querying; 2) contextual metadata, which a) provides rich information on persons, organizations and other aspects associated with publications, and b) supports interoperation among common metadata formats used in Current Research Information Systems, journal systems or Institutional Repositories; and 3) detailed metadata, which is specific to the domain of scientific publication evaluation. An example shows how metadata quality is improved in the Identification System for Cuban Research Organizations, one of Sceiba’s component applications.

Keywords

Current research information system, Metadata quality, Scientific publication quality

Full Text