Abstract

The Darwin Core Archive (DwC-A) format, based on the Darwin Core standard (Wieczorek et al. 2012), facilitates the exchange, management, and integration of biodiversity data from multiple sources. Because the format is shared, datasets can be aggregated at community-supported infrastructures, merged in different combinations, meta-analyzed, and submitted to public repositories (Baker et al. 2014). DwC-As thus serve as unifying archives in cumulative collective efforts, such as biodiversity inventories at different spatial and taxonomic scales. Here we describe PyDwCA, a Python library implemented to handle the "star schema" of DwC-A. The library reads compressed zip files containing the expected meta.xml and uses that file to assign the core component and its extensions. It also provides Python classes to define the core, the extensions, and the metadata file, so that an archive can be created and written to a compressed zip file. In addition, PyDwCA implements functionality to select, filter, and merge DwC-A files. We present this new tool in the context of the construction of the Chilean National Biodiversity Inventory (Fig. 1), but PyDwCA is a versatile technical solution applicable to other contexts in biodiversity informatics (e.g., integration of datasets from biological collections and sampling events). To exemplify how PyDwCA works, we present the step-by-step integration of the Chilean Catalogue of Vascular Plants (Rodriguez et al. 2018) into a matrix provided by the Catalogue of Life (Banki 2024), filtered to the species with occurrences recorded for Chile in the Global Biodiversity Information Facility (GBIF) (GBIF.Org 2023).
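
As an illustration of the star layout described above, the following minimal sketch reads a DwC-A zip file, uses meta.xml to identify the core and its extensions, and filters core records through an extension. It does not use the PyDwCA API, which is not shown in this abstract; it relies only on the Python standard library. The toy archive, the file names, and the helper names (build_toy_archive, read_table, load_archive) are assumptions made for the example, and the sketch handles only plain comma-delimited files encoded in UTF-8.

```python
import csv
import io
import xml.etree.ElementTree as ET
import zipfile

# Namespace of the Darwin Core text guide, used by meta.xml.
NS = {"dwc": "http://rs.tdwg.org/dwc/text/"}

# Toy descriptor (an assumption for this example): one Taxon core
# and one Distribution extension linked to it by the core id.
META_XML = """<archive xmlns="http://rs.tdwg.org/dwc/text/">
  <core rowType="http://rs.tdwg.org/dwc/terms/Taxon"
        fieldsTerminatedBy="," ignoreHeaderLines="1">
    <files><location>taxon.csv</location></files>
    <id index="0"/>
    <field index="1" term="http://rs.tdwg.org/dwc/terms/scientificName"/>
  </core>
  <extension rowType="http://rs.gbif.org/terms/1.0/Distribution"
             fieldsTerminatedBy="," ignoreHeaderLines="1">
    <files><location>distribution.csv</location></files>
    <coreid index="0"/>
    <field index="1" term="http://rs.tdwg.org/dwc/terms/countryCode"/>
  </extension>
</archive>"""


def build_toy_archive():
    """Return the bytes of a small in-memory DwC-A (hypothetical data)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("meta.xml", META_XML)
        zf.writestr(
            "taxon.csv",
            "taxonID,scientificName\nt1,Araucaria araucana\nt2,Quercus robur\n",
        )
        zf.writestr("distribution.csv", "taxonID,countryCode\nt1,CL\nt2,GB\n")
    return buffer.getvalue()


def read_table(zf, elem, id_tag):
    """Read the data file described by a <core> or <extension> element."""
    location = elem.findtext("dwc:files/dwc:location", namespaces=NS)
    skip = int(elem.get("ignoreHeaderLines", "0"))
    delimiter = elem.get("fieldsTerminatedBy", ",")
    id_index = int(elem.find(f"dwc:{id_tag}", NS).get("index"))
    # Map column index -> short term name (last segment of the term URI).
    columns = {
        int(f.get("index")): f.get("term").rsplit("/", 1)[-1]
        for f in elem.findall("dwc:field", NS)
    }
    text = zf.read(location).decode("utf-8")
    rows = []
    for raw in list(csv.reader(io.StringIO(text), delimiter=delimiter))[skip:]:
        row = {name: raw[i] for i, name in columns.items()}
        row["id"] = raw[id_index]  # core id, or pointer to the core record
        rows.append(row)
    return elem.get("rowType"), rows


def load_archive(zip_bytes):
    """Return (core, extensions), each table as (rowType, list of row dicts)."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        root = ET.fromstring(zf.read("meta.xml"))
        core = read_table(zf, root.find("dwc:core", NS), "id")
        extensions = [
            read_table(zf, ext, "coreid")
            for ext in root.findall("dwc:extension", NS)
        ]
    return core, extensions


core, extensions = load_archive(build_toy_archive())
core_type, taxa = core
_, distributions = extensions[0]

# Keep only the core taxa that have a distribution record for Chile.
chilean_ids = {d["id"] for d in distributions if d["countryCode"] == "CL"}
chilean_taxa = [t["scientificName"] for t in taxa if t["id"] in chilean_ids]
print(chilean_taxa)  # ['Araucaria araucana']
```

The filter at the end mirrors, on toy data, the idea of restricting a taxonomic core to species recorded for Chile. Real archives may declare tab delimiters as escape sequences, quoting rules, or other encodings in meta.xml, none of which this sketch handles.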
