Post-Publication Linking

Felipe Lorenz Simoes, Tatiana Ruschel, Valdenar Da Rosa Gonçalves, Carolina Sokolowicz, Jonas Castro, Julia Giora, Donat Agosti, Diego Alvares, Juliana Wingert

doi:10.3897/biss.7.110692

Abstract

One of the main challenges in biodiversity data reusability is transforming what is provided in research publications into different, reusable formats that follow the FAIR (Findable, Accessible, Interoperable, Reusable) principles (Agosti and Egloff 2009). Most often, data is confined to text, figures and tables in the so-called “PDF prison” or other flat formats. Plazi's infrastructure and workflow (Guidoti et al. 2021) transform such data into reusable formats that can then be exported and linked across different platforms, such as the Global Biodiversity Information Facility (GBIF), Biodiversity Literature Repository, Zenodo, Synospecies, ChecklistBank, and OpenBiodiv, among others. To liberate the many relevant pieces of information held in these publications, such as taxonomic treatments (Catapano 2019), material citations (Darwin Core term MaterialCitation) or bibliographic references, one has to run a single document or a batch of documents through a series of extraction steps, which can be done manually or automatically through the use of templates. A template is a set of parameters that tells Plazi's dedicated software (the GoldenGATE suite) how to read a document and where to find key pieces of information; these parameters are established by examining publication standards and publisher-specific layouts, followed by a series of iterative tests to ascertain the quality of the automation. However, even extensive testing does not remove the need for human quality control (Simoes et al. 2021). To that end, Plazi has a quality control process, based on logical rules, that checks the components of the extracted document and flags errors at four levels of severity, which a trained user can then check and correct if needed. These errors are also used in a data transit control mechanism, internally dubbed “the gatekeeper”, which blocks certain data transits (to create deposits or to reuse data) when specific errors are present. In this presentation, we will go through the steps of the entire process, from publication to liberated data (and how it is presented in the linked platforms), highlighting the importance of accurate quality control, and explore some of the many challenges along the way.
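
The rule-based quality control and the “gatekeeper” described in the abstract can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the four severity labels, the two rules, the document layout (a plain dictionary) and the transit names are hypothetical and are not taken from Plazi's software. The abstract states only that rules flag errors at four levels of severity and that specific errors block certain data transits.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Callable, Optional


class Severity(IntEnum):
    # Hypothetical labels; the abstract says only that there are four levels.
    MINOR = 1
    MODERATE = 2
    MAJOR = 3
    BLOCKER = 4


@dataclass(frozen=True)
class Flag:
    rule: str
    severity: Severity
    message: str


# A rule inspects an extracted document (a plain dict in this sketch)
# and returns a Flag if the document violates it, otherwise None.
Rule = Callable[[dict], Optional[Flag]]


def treatment_has_taxon_name(doc: dict) -> Optional[Flag]:
    # Hypothetical rule: every taxonomic treatment needs a taxon name.
    for treatment in doc.get("treatments", []):
        if not treatment.get("taxon_name"):
            return Flag("treatment_has_taxon_name", Severity.BLOCKER,
                        "treatment without a taxon name")
    return None


def material_citation_has_country(doc: dict) -> Optional[Flag]:
    # Hypothetical rule: a material citation should name a country.
    for citation in doc.get("material_citations", []):
        if not citation.get("country"):
            return Flag("material_citation_has_country", Severity.MINOR,
                        "material citation without a country")
    return None


RULES: list[Rule] = [treatment_has_taxon_name, material_citation_has_country]

# Hypothetical policy: which rule violations block which data transit.
BLOCKING_RULES: dict[str, set[str]] = {
    "treatment_export": {"treatment_has_taxon_name"},
    "archive_deposit": set(),
}


def run_quality_control(doc: dict, rules: list[Rule] = RULES) -> list[Flag]:
    # Apply every rule and collect the flags that were raised.
    return [flag for rule in rules if (flag := rule(doc)) is not None]


def gatekeeper_allows(flags: list[Flag], transit: str) -> bool:
    # A transit is blocked if any raised flag belongs to its blocking set.
    # Transits with no configured policy are allowed in this sketch.
    blocking = BLOCKING_RULES.get(transit, set())
    return not any(flag.rule in blocking for flag in flags)


doc = {
    "treatments": [{"taxon_name": ""}],
    "material_citations": [{"country": "Brazil"}],
}
flags = run_quality_control(doc)
for flag in flags:
    print(flag.severity.name, flag.rule, flag.message)
print(gatekeeper_allows(flags, "treatment_export"))
print(gatekeeper_allows(flags, "archive_deposit"))
```

In this sketch the gatekeeper keys on specific rule names rather than on a severity threshold, mirroring the abstract's wording that transits are blocked “in the presence of specific errors”; a threshold-based policy would be an equally plausible design.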

Journal: Biodiversity Information Science and Standards
Publication Date: Aug 8, 2023
License type: CC BY 4.0

Similar Papers

Delivering Fit-for-Use Data: Quality control
Felipe Simoes ... Donat Agosti
Biodiversity Information Science and Standards | VOL. 5
20 Sep 2021

Best Practice for Publishing Environmental DNA (eDNA) Data According to FAIR Principles
Miwa Takahashi ... Oliver Berry
Biodiversity Information Science and Standards | VOL. 8
14 Oct 2024

FAIR Health Data in the National and International Data Space (FAIRe Gesundheitsdaten im nationalen und internationalen Datenraum)
Dagmar Waltemath ... Dagmar Krefting
Bundesgesundheitsblatt - Gesundheitsforschung - Gesundheitsschutz | VOL. 67
15 May 2024

The Open Biodiversity Knowledge Management (eco-)System: Tools and Services for Extraction, Mobilization, Handling and Re-use of Data from the Published Literature
Lyubomir Penev ... Donat Agosti
Biodiversity Information Science and Standards | VOL. 2
17 May 2018
