Abstract

Botanic gardens are an invaluable refuge for plant diversity, serving conservation, education and research. Worldwide, they manage over 100,000 species, roughly 30% of all plant species diversity, and over 41% of known threatened species; the botanic gardens in Germany house approximately 50,000 different species (Marquardt et al. in press). Scientists in need of plant material rely upon these resources for their research; they require a pooled, up-to-date inventory of, ideally, all accessions of these gardens. Sharing data from (living) specimen collections online has become routine in recent years; initiatives like PlantSearch of Botanic Gardens Conservation International and the Global Biodiversity Information Facility (GBIF) allow users to request specimens of interest. However, these catalogues are accessible to everyone. Legitimate concerns about potential theft and legal issues keep curators of living collections from sharing their full catalogues; in most cases, only filtered views of the data are fed into these networks. Gardens4Science (http://gardens4science.biocase.org) aims to overcome this issue by creating a trusted network between botanic gardens that allows unfiltered access to the constituents’ accession catalogues. This unified data pool needs to be automatically synchronized with the individual gardens’ catalogues, irrespective of the collection management systems used locally. For the three-year construction phase of Gardens4Science, the focus is on Cactaceae and Bromeliaceae, since these families are well represented in the collections and are ideal models for studying the origin of biodiversity on an evolutionary time scale. Gardens4Science’s technical architecture (Fig. 1) is based on existing tools for setting up biodiversity networks: The BioCASe (Biological Collections Access Service) Provider Software acts as an interface to the local databases that shields the network from their peculiarities (database management systems and data models used).
BioCASe transforms the data into the Access to Biological Collections Data schema (ABCD) and publishes them as a BioCASe-compliant web service (Holetschek and Döring 2008, Holetschek et al. 2012). The data portal is based on portal software from the Global Genome Biodiversity Network and provides a user-specific view of the data. Registered trusted users will be able to display full details of individual accessions, whereas guest users will see only an aggregated view (Droege et al. 2014). The Berlin Indexing and Harvesting Toolkit (B-HIT) is used for harvesting the BioCASe web services of the local catalogues and creating a unified index database (Kelbert et al. 2015). Harvesting is done at regular intervals in order to keep the index in sync with the source databases and does not require any action on the provider’s side. In addition to harvesting, B-HIT performs several data cleaning steps.
Foremost, it reconciles scientific names from the source databases with a taxonomic backbone (currently caryophyllales.org for Cactaceae and the Butcher and Gouda checklist for Bromeliaceae), which makes it possible to harmonize the taxonomies from the different sources and to correct outdated species names and orthographic mistakes. Provenance information is validated (for example, specified geographic coordinates versus country) and corrected, if possible; date values are parsed and converted into a standard format. The issues found and potential corrections are compiled in reports and sent to the curators, so that the mistakes can be rectified in the source databases. In the construction phase, Gardens4Science consists of seven German botanic gardens that share their accessions of the Bromeliaceae and Cactaceae families. As of March 2019, 19,539 records have been published in Evo-BoGa, with about 3,500 to be added by the end of the project in January 2020. After the construction phase, it is planned to extend the network to include more botanic gardens, both from Germany and other countries, as well as additional plant families.
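
The data cleaning steps described above (name reconciliation against a backbone, date normalization, and an issue report for curators) can be illustrated with a minimal Python sketch. It is not the B-HIT implementation: the backbone excerpt, the synonym map, the accepted date formats, the similarity cutoff and all function and field names are assumptions chosen for illustration only.

```python
from datetime import datetime
from difflib import get_close_matches

# Hypothetical excerpt of a taxonomic backbone (illustration only).
ACCEPTED_NAMES = [
    "Mammillaria elongata",
    "Opuntia ficus-indica",
    "Tillandsia usneoides",
]
# Hypothetical map from outdated names to accepted names.
SYNONYMS = {"Cactus ficus-indica": "Opuntia ficus-indica"}

# Assumed input formats; a real catalogue may use others.
DATE_FORMATS = ["%Y-%m-%d", "%d.%m.%Y", "%d/%m/%Y"]


def reconcile_name(name):
    """Return (accepted_name, issue); issue is None if the name is accepted."""
    cleaned = " ".join(name.split())  # collapse stray whitespace
    if cleaned in ACCEPTED_NAMES:
        return cleaned, None
    if cleaned in SYNONYMS:
        return SYNONYMS[cleaned], "outdated name"
    # Fuzzy match to catch orthographic mistakes (cutoff is an assumption).
    candidates = get_close_matches(cleaned, ACCEPTED_NAMES, n=1, cutoff=0.85)
    if candidates:
        return candidates[0], "possible misspelling"
    return None, "no match in backbone"


def normalize_date(value):
    """Return the date as an ISO 8601 string, or None if no format matches."""
    for fmt in DATE_FORMATS:
        try:
            parsed = datetime.strptime(value.strip(), fmt)
        except ValueError:
            continue
        return parsed.date().isoformat()
    return None


def build_report(records):
    """Collect issues and suggested corrections for the curators."""
    issues = []
    for record in records:
        accepted, issue = reconcile_name(record["name"])
        if issue:
            issues.append({"id": record["id"], "field": "name",
                           "issue": issue, "suggestion": accepted})
        date = record.get("date")
        if date and normalize_date(date) is None:
            issues.append({"id": record["id"], "field": "date",
                           "issue": "unparseable date", "suggestion": None})
    return issues


if __name__ == "__main__":
    sample = [
        {"id": "A1", "name": "Opuntia ficus-indca", "date": "24.03.2019"},
        {"id": "A2", "name": "Tillandsia usneoides", "date": "spring 2019"},
    ]
    for entry in build_report(sample):
        print(entry)
```

In this sketch, corrections are only suggested in the report and never written back, which mirrors the workflow described above, where curators rectify mistakes in the source databases themselves.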
