Abstract

The Global Genome Initiative (GGI) endeavors to collect the Earth’s genomic biodiversity, preserve this biodiversity as high-quality genetic resources in Global Genome Biodiversity Network (GGBN) affiliated biorepositories, increase knowledge of biodiversity through genetic sequencing, and make resources and knowledge accessible to researchers via the GGBN Data Portal, the Global Catalogue of Microorganisms (GCM), and the National Center for Biotechnology Information (NCBI) GenBank. Within its seven-year timespan, GGI is attempting to collect samples from all 9,870 families and half of the 165,683 genera of life on Earth (Roskov et al. 2019). To accomplish this, GGI must consider the following questions together: What life exists? What has already been preserved as physical resources? What is already known from genetic sequencing? How will novel or legacy collections fill the gaps in resources or knowledge? To answer the first question, GGI has explored the use of taxonomic authorities such as the Global Biodiversity Information Facility (GBIF) Backbone Taxonomy and the Catalogue of Life as taxonomic backbones to match taxonomic names and derive complete lists of extant taxa at each taxonomic rank. To answer the second question, GGI uses the GGBN Data Portal API and the GCM website to extract lists of taxonomic names, which are then standardized to a taxonomic backbone. To answer the third question, GGI follows the recommendations of Hanner (2009) for identifying high-quality DNA barcode records: it employs the NCBI Entrez Programming Utilities to download GenBank records, then standardizes the associated taxa to a taxonomic backbone.
Finally, GGI compares lists of taxa found in specific geographic areas or specific legacy collections to determine the amount of taxonomic novelty a new collection may supply. GGI refers to this comparison of taxonomic databases as a taxonomic gap analysis, an assessment of how well a potential collection fills the taxonomic gaps in physical collections and genetic knowledge. A gap analysis performed by GGI in March 2019 shows that 49% of families and 78% of genera still have no representation as either physical samples or genetic information (Table 1). There are substantial gaps to fill in the endeavor to capture the Earth's biodiversity, and taxonomic gap analysis will continue to be a powerful tool to identify the most promising potential new collections.
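
The comparison described above can be illustrated with a minimal sketch, assuming every name list has already been standardized to the same taxonomic backbone so that the analysis reduces to set operations. The function name `gap_analysis`, the `GapReport` class, and the family names in the usage example are hypothetical and purely illustrative; they are not GGI's code or data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GapReport:
    total: int                        # number of taxa in the backbone
    unrepresented: frozenset          # taxa with no sample and no sequence
    novel_from_candidate: frozenset   # gap taxa the candidate collection would fill

    @property
    def gap_fraction(self) -> float:
        """Share of backbone taxa that are still unrepresented."""
        return len(self.unrepresented) / self.total if self.total else 0.0


def gap_analysis(backbone, preserved, sequenced, candidate) -> GapReport:
    """Compare name lists that are assumed to share one taxonomic backbone."""
    backbone = frozenset(backbone)
    # A taxon counts as represented if it has a physical sample OR a sequence.
    # Names that are absent from the backbone are ignored.
    represented = (frozenset(preserved) | frozenset(sequenced)) & backbone
    unrepresented = backbone - represented
    # Taxonomic novelty: candidate taxa that fall inside the current gap.
    novel = frozenset(candidate) & unrepresented
    return GapReport(len(backbone), unrepresented, novel)


# Toy input: illustrative family names only, not results from the study.
report = gap_analysis(
    backbone={"Felidae", "Canidae", "Ursidae", "Hominidae"},
    preserved={"Felidae"},
    sequenced={"Felidae", "Canidae"},
    candidate={"Ursidae", "Canidae", "Mustelidae"},
)
print(sorted(report.unrepresented))         # ['Hominidae', 'Ursidae']
print(sorted(report.novel_from_candidate))  # ['Ursidae']
print(report.gap_fraction)                  # 0.5
```

In this toy case the candidate collection contributes one family that is missing from both the physical and the sequence records, which is the kind of signal a gap analysis would use to rank potential collections. The same comparison could be repeated per taxonomic rank (families, genera) with the corresponding name lists.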
