Abstract

Background: Latin America harbors some of the most biodiverse countries in the world, including Colombia. Despite the increasing use of cutting-edge genomics and bioinformatics technologies in many biological science fields worldwide, the region has fallen behind in applying these approaches to biodiversity studies. In this study, we used data mining methods to search four major public genetic sequence databases: NCBI Nucleotide, NCBI BioProject, the Pathosystems Resource Integration Center (PATRIC), and Barcode of Life Data (BOLD) Systems. We aimed to determine how much of Colombian biodiversity is represented in the genetic data stored in these databases and how much of this information has been generated by national institutions. Additionally, we compared these data for Colombia with those for other highly biodiverse countries in Latin America, such as Brazil, Argentina, Costa Rica, Mexico, and Peru.

Results: In Nucleotide, 66.84% of the total records for Colombia have been published at the national level, and these data represent less than 5% of the total number of species reported for the country. In BioProject, 70.46% of records were generated by national institutions, and the great majority of them correspond to microorganisms. In BOLD Systems, 26% of records have been submitted by national institutions, representing 258 species for Colombia, or approximately 0.46% of the total biodiversity reported for the country (56,343 species). Finally, in the PATRIC database, 13.25% of the reported sequences were contributed by national institutions. Colombia has better biodiversity representation in public databases than some other Latin American countries, such as Costa Rica and Peru. Mexico and Argentina have the highest representation of species at the national level, even though Brazil and Colombia hold the first and second places in biodiversity worldwide.

Conclusions: Our findings show gaps in the representation of Colombian biodiversity at the molecular and genetic levels in widely consulted public databases. National funding for high-throughput molecular research, the cost of NGS technologies, and access to genetic resources are limiting factors. These gaps should be taken as an opportunity to foster collaborative projects between research groups in the Latin American region to study the vast biodiversity of these countries using ‘omics’ technologies.
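
The 0.46% figure reported for BOLD Systems can be reproduced from the two counts given in the abstract (258 species with records, 56,343 species reported for Colombia). The following is a minimal sketch of that arithmetic only; the function name `species_coverage_percent` is illustrative and does not come from the paper, and the same calculation would presumably apply to the other databases if their species counts were known.

```python
def species_coverage_percent(species_with_records: int, total_species: int) -> float:
    """Return the share of reported species that have at least one record, in percent."""
    if total_species <= 0:
        raise ValueError("total_species must be positive")
    return 100.0 * species_with_records / total_species


# Counts quoted in the abstract for BOLD Systems and Colombia.
bold_species = 258
colombia_total_species = 56_343

coverage = species_coverage_percent(bold_species, colombia_total_species)
print(f"{coverage:.2f}%")  # prints "0.46%"
```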

Highlights

  • Latin America harbors some of the most biodiverse countries in the world, including Colombia

  • We aimed to determine how much sequencing data on Colombian biodiversity submitted by national institutions is available in four main genetic sequence databases: Nucleotide and BioProject of the NCBI [28], the Pathosystems Resource Integration Center (PATRIC) bacterial bioinformatics database [29], and Barcode of Life Data (BOLD) Systems [30]

  • (i) Nucleotide, which gathers all nucleic acid sequencing data from the DNA Databank of Japan (DDBJ), the European Molecular Biology Laboratory's European Bioinformatics Institute (EMBL-EBI), and the National Center for Biotechnology Information (NCBI); (ii) BioProject, which gathers all biological information and data related to a single project and allows information to be retrieved through related links, information that is otherwise sometimes difficult to find because of inconsistent annotations, multiple independent submissions, and/or diverse data types that are usually stored in different databases; (iii) Barcode of Life Data (BOLD) Systems, which provides barcode sequence data from the planet’s biodiversity; and (iv) the Pathosystems Resource Integration Center (PATRIC), a bacterial bioinformatics database


Summary

Introduction

Latin America harbors some of the most biodiverse countries in the world, including Colombia. We aimed to determine how much of Colombian biodiversity is represented in the genetic data stored in public sequence databases and how much of this information has been generated by national institutions. We compared these data for Colombia with those for other highly biodiverse countries in Latin America, such as Brazil, Argentina, Costa Rica, Mexico, and Peru. There are approximately 56,343 species reported for Colombia, including 7385 vertebrates, 20,647 invertebrates, 1637 lichens, 2160 algae, 30,736 plants, and 1637 fungi [7]. These numbers place Colombia as the second most biodiverse country worldwide, even without taking microbial species richness into account. Maintaining this biodiversity requires prioritizing and carrying out conservation strategies based on biological, ecological, systematic, and, most recently, genetic knowledge of these species [2, 9, 10].

