Abstract

South America hosts a great diversity of biomes whose ecosystems are constantly threatened by the expansion of human activity. The emergence and re-emergence of viruses that affect human populations and ecosystems have increased in recent decades. Given the growing accumulation of genomic data, we explore the potential of public databases related to South America to detect signals that contribute to virosphere research, with emphasis on the surveillance of viruses of medical and ecological relevance. We profiled 120 Sequence Read Archive (SRA) metagenomes from 19 independent projects from the last decade. Overall, our analyses identified only 0.38% of the total sequences as viral, with a higher proportion of RNA viruses. Among the environmental models analyzed, the metagenomes with the most important viral sequences were (1) aquatic samples from the Amazon River, (2) sewage from Brasilia, and (3) soil from the state of São Paulo; among the animal transmission models, viral sequences were detected in mosquitoes from Rio de Janeiro and bats from Amazonia. The classification of viral signals into operational taxonomic units (OTUs) at the family level allowed us to infer, from metadata, a probable host range for the virome detected in each sample. Further, several motifs and viral sequences are related to specific viruses with emergence potential from the Togaviridae, Arenaviridae, and Flaviviridae families. In this context, exploring third-party public databases allowed us to evaluate the scope and informative capacity of their sequences and to detect signals related to viruses of clinical or environmental importance, from which we inferred traits associated with probable transmission routes or signals of ecological disequilibrium.
Evaluation of our results showed that, in most cases, the size and type of the reference database, the guanine–cytosine (GC) percentage, and the length of the query sequences greatly influence the taxonomic classification of the sequences. In sum, our findings describe how public genomic data can be explored as an approach to epidemiological surveillance and to understanding the virosphere.
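As an illustration of two of the sequence properties named above, the following minimal sketch computes the GC percentage and length of query sequences. It is not the pipeline used in this study; the function names `gc_percent` and `summarize_reads` and the example reads are hypothetical, and ambiguous bases such as N are assumed to be excluded from the GC denominator.

```python
def gc_percent(seq: str) -> float:
    """Return the percentage of G and C among unambiguous bases (A, C, G, T, U).

    Assumption for this sketch: ambiguous symbols such as N are ignored.
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T") + seq.count("U")
    total = gc + at
    return 100.0 * gc / total if total else 0.0


def summarize_reads(reads):
    """Return (name, length, GC%) for each (name, sequence) pair."""
    return [(name, len(seq), round(gc_percent(seq), 2)) for name, seq in reads]


if __name__ == "__main__":
    # Hypothetical reads, for illustration only.
    example = [("read_1", "ATGCGCGTTA"), ("read_2", "GGGCCCNNAT")]
    for name, length, gc in summarize_reads(example):
        print(f"{name}\tlength={length}\tGC={gc}%")
```

Per-read summaries of this kind could be compared against classification outcomes to check whether short or compositionally skewed queries are assigned less reliably.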

Highlights

  • Viral emergence events have increased progressively in recent decades (Jones et al, 2008; Gould and Higgs, 2009; Ewald, 2011; Coffey et al, 2014; Nobre et al, 2016).

  • The highest proportion was observed in the dsRNA category from the Aedes aegypti samples (Figure 2) and corresponded to sequences identified as Reoviridae, Picobirnaviridae, Partitiviridae, Hypovirus, and Sedoreovirinae. This uneven distribution of sequences by genome type in the Aedes sp. samples may reflect bias from the methods used to obtain and concentrate the samples.

  • Viral sequences of the ssDNA, dsRNA, and dsDNA-RT classes occurred at the lowest frequencies.

Summary

Introduction

Viral emergence events have increased progressively in recent decades (Jones et al, 2008; Gould and Higgs, 2009; Ewald, 2011; Coffey et al, 2014; Nobre et al, 2016). Accumulated evidence reveals an overwhelming number of new viruses and routes of interaction that were not considered a few decades ago (Wong et al, 2007; Sobel Leonard et al, 2017; Nouri et al, 2018; Schoeman and Fielding, 2019). This large volume of new viral sequences makes it possible to investigate the still-hidden viral diversity, which is vast and actively participates in ecological processes (Koonin and Dolja, 2013; Posada-Cespedes and Seifert, 2017). The viruses observed to date represent only a small fraction of the virosphere (Zhang et al, 2018).

