Abstract

Bacterial genomics has revolutionized our understanding of the microbial tree of life; however, mapping and visualizing the distribution of functional traits across bacteria remains a challenge. Here, we introduce AnnoTree, an interactive, functionally annotated bacterial tree of life that integrates taxonomic, phylogenetic and functional annotation data from over 27 000 bacterial and 1500 archaeal genomes. AnnoTree enables visualization of millions of precomputed genome annotations across the bacterial and archaeal phylogenies, thereby allowing users to explore gene distributions as well as patterns of gene gain and loss in prokaryotes. Using AnnoTree, we examined the phylogenomic distributions of 28 311 gene/protein families, and measured their phylogenetic conservation, patchiness, and lineage-specificity within bacteria. Our analyses revealed widespread phylogenetic patchiness among bacterial gene families, reflecting the dynamic evolution of prokaryotic genomes. The list of most patchy traits was dominated by genes involved in phage infection/defense, mobile elements, and antibiotic resistance, along with numerous intriguing metabolic enzymes that appear to have undergone frequent horizontal transfer. We anticipate that AnnoTree will be a valuable resource for exploring prokaryotic gene histories, and will act as a catalyst for biological and evolutionary hypothesis generation. AnnoTree is freely available at http://annotree.uwaterloo.ca.

Highlights

  • Important biological and evolutionary insights can be generated by exploring the presence/absence of genes and functional annotations across species phylogenies

  • Protein sequences and functional annotations are stored in a backend MySQL database for rapid retrieval by the front-end AnnoTree application (Figure 1)

  • As an initial exploration of the data within AnnoTree, we examined the distributions of all 77 004 395 bacterial Pfam and KEGG Orthology (KO) annotations when mapped onto the bacterial Genome Taxonomy Database (GTDB) tree of life (Release 02-RS83)

Introduction

Important biological and evolutionary insights can be generated by exploring the presence/absence of genes and functional annotations across species phylogenies. With the ongoing exponential increase in available genome sequences, including information from previously uncharacterized and uncultured lineages, online genomic repositories are becoming increasingly valuable collections of predicted genes and functional annotations. With this wealth of genomic data comes the opportunity for large-scale examinations of gene family distributions and evolutionary histories, but that opportunity is limited when databases are not easily accessed, updated, or visualized. There is a need for tools that allow users to explore gene/function distributions across a taxonomically curated and highly resolved tree of life.
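To make the idea of exploring gene presence/absence across lineages concrete, the following is a minimal Python sketch. It is not AnnoTree's implementation or its patchiness metric. The genome names, phylum assignments, annotation sets, and the function `prevalence_by_phylum` are all hypothetical, invented for illustration, and the family identifiers are used only as placeholder labels. The sketch computes, for one gene family, the fraction of genomes in each phylum that carry it.

```python
from collections import defaultdict

# Toy data, invented for illustration only (not taken from AnnoTree).
genome_phylum = {
    "g1": "Proteobacteria",
    "g2": "Proteobacteria",
    "g3": "Firmicutes",
    "g4": "Firmicutes",
    "g5": "Actinobacteria",
}
# Each genome maps to the set of family IDs annotated in it (placeholder IDs).
genome_annotations = {
    "g1": {"PF00001", "K00001"},
    "g2": {"K00001"},
    "g3": {"PF00001"},
    "g4": set(),
    "g5": {"K00001"},
}


def prevalence_by_phylum(family, genome_phylum, genome_annotations):
    """Return {phylum: fraction of its genomes annotated with `family`}."""
    totals = defaultdict(int)  # genomes seen per phylum
    hits = defaultdict(int)    # genomes per phylum that carry the family
    for genome, phylum in genome_phylum.items():
        totals[phylum] += 1
        if family in genome_annotations.get(genome, set()):
            hits[phylum] += 1
    return {phylum: hits[phylum] / totals[phylum] for phylum in totals}


prevalence = prevalence_by_phylum("PF00001", genome_phylum, genome_annotations)
for phylum, fraction in prevalence.items():
    print(f"{phylum}: {fraction:.2f}")
```

A family found at intermediate frequency in many distantly related lineages is the kind of pattern the abstract calls patchy, whereas a family present in nearly all genomes of a single lineage and absent elsewhere would be lineage-specific. Quantifying either property rigorously requires the tree topology, which this flat per-phylum summary ignores.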
