Abstract

The paper describes a pilot project to convert a conventional floristic checklist, written in a standard word processing program, into structured data in the Darwin Core Archive format. After peer review and editorial acceptance, the final revised version of the checklist was converted into a Darwin Core Archive by means of regular expressions and then published in two forms: as a human-readable, traditional botanical publication and as Darwin Core Archive data files. The data were published and indexed through the Global Biodiversity Information Facility (GBIF) Integrated Publishing Toolkit (IPT), and significant portions of the text of the paper were used to describe the metadata on IPT. After publication, the data will become available through the GBIF infrastructure and can be re-used on their own or collated with other data.
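As a rough illustration of the regular-expression step mentioned in the abstract, the sketch below parses one checklist entry into a record keyed by Darwin Core terms. The entry layout ("Genus epithet Authorship; habit"), the function name `entry_to_dwc`, and the sample taxon are assumptions made for this example only; they are not the layout or the expressions used for the published checklist.

```python
import re

# Hypothetical entry layout (an assumption, not the published checklist's):
#   "Genus epithet Authorship; habit"
ENTRY = re.compile(
    r"^(?P<genus>[A-Z][a-z]+) "      # capitalised genus name
    r"(?P<epithet>[a-z-]+) "         # lower-case specific epithet
    r"(?P<authorship>[^;]+); "       # everything up to the first semicolon
    r"(?P<habit>[^;]+)$"             # free-text habit note
)


def entry_to_dwc(line, family):
    """Return a dict keyed by Darwin Core terms, or None if the line does not match."""
    m = ENTRY.match(line.strip())
    if m is None:
        return None
    return {
        "family": family,
        "genus": m["genus"],
        "specificEpithet": m["epithet"],
        "scientificName": f"{m['genus']} {m['epithet']} {m['authorship']}",
        "scientificNameAuthorship": m["authorship"],
        "taxonRank": "species",
        "taxonRemarks": m["habit"],  # habit stored as a remark in this sketch
    }


# Invented placeholder taxon, used only to show the call.
row = entry_to_dwc("Exemplum fictum (Auct.) Auct.; herb", "Exemplaceae")
```

Lines that do not fit the pattern return `None`, so they can be collected and checked by hand rather than silently dropped.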

Highlights

  • Data mining and converting texts to structured data, especially of historical biodiversity literature, is a major challenge in biodiversity informatics

  • After peer review and editorial acceptance, the final revised version of the checklist was converted from the original manuscript into Darwin Core Archive (DwC-A) format and published both as a conventional paper in PhytoKeys and as DwC-A structured data through the Global Biodiversity Information Facility (GBIF) Integrated Publishing Toolkit (IPT)

  • The resulting zip file is the Darwin Core Archive itself. It was validated using the Darwin Core Archive Validator and uploaded to the Pensoft IPT Data Hosting Center. This pilot project, undertaken with the Checklist of vascular plants of the Department of Ñeembucú, Paraguay, should be seen as a test of a necessary step in creating a data conversion and publishing workflow for primary biodiversity data based on a new interoperability format, the Darwin Core Archive (DwC-A)
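To make the statement that the zip file is the archive more concrete, the following minimal sketch packages a tab-delimited taxon table together with a `meta.xml` descriptor into a single zip. The file name `taxa.txt`, the three-column field list, and the helper names are illustrative assumptions; a complete archive would normally also carry a metadata document (for example `eml.xml`), which is omitted here.

```python
import csv
import io
import zipfile

DWC = "http://rs.tdwg.org/dwc/terms/"
FIELDS = ["taxonID", "scientificName", "family"]  # illustrative subset of terms


def build_meta_xml(fields):
    """Describe the core data file: one <field> per column, mapped to a Darwin Core term."""
    lines = [
        '<archive xmlns="http://rs.tdwg.org/dwc/text/">',
        '  <core encoding="UTF-8" fieldsTerminatedBy="\\t" linesTerminatedBy="\\n"',
        f'        ignoreHeaderLines="1" rowType="{DWC}Taxon">',
        "    <files><location>taxa.txt</location></files>",
        '    <id index="0"/>',
    ]
    lines += [
        f'    <field index="{i}" term="{DWC}{name}"/>'
        for i, name in enumerate(fields)
    ]
    lines += ["  </core>", "</archive>"]
    return "\n".join(lines)


def write_archive(rows, target):
    """Write rows (dicts keyed by FIELDS) plus meta.xml into a zip at target (path or file object)."""
    table = io.StringIO()
    writer = csv.DictWriter(
        table, fieldnames=FIELDS, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    with zipfile.ZipFile(target, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("taxa.txt", table.getvalue())
        zf.writestr("meta.xml", build_meta_xml(FIELDS))
```

Calling `write_archive(rows, "checklist.zip")` with a list of row dictionaries would produce a file that can then be passed to a validator before upload; the output name is again only an example.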


Summary

Introduction

Data mining and converting texts to structured data, especially of historical biodiversity literature, is a major challenge in biodiversity informatics. After peer review and editorial acceptance, the final revised version of the checklist was converted from the original manuscript into Darwin Core Archive format and published both as a conventional paper in PhytoKeys and as DwC-A structured data through the Global Biodiversity Information Facility (GBIF) Integrated Publishing Toolkit (IPT). GBIF and Biodiversity Information Standards (TDWG) recently launched a new format for storing species occurrence data and taxon checklists, named Darwin Core Archive (DwC-A).
