Abstract

Our view of genome size in Archaea and Bacteria has remained skewed because the data have been dominated by genomes of microorganisms cultivated under laboratory settings. However, the continuous effort to catalog Earth’s microbiomes, propelled in particular by recent extensive work on uncultivated microorganisms, provides an opportunity to revise our perspective on genome size distribution. We present a meta-analysis that includes 26,101 representative genomes from three published genomic databases: metagenome-assembled genomes (MAGs) from GEMs and stratfreshDB, and isolate genomes from GTDB. Aquatic and host-associated microbial genomes present on average the smallest estimated genome sizes (3.1 and 3.0 Mbp, respectively). These are followed by terrestrial microbial genomes (average 3.7 Mbp) and genomes from isolated microorganisms (average 4.3 Mbp). On the one hand, aquatic and host-associated ecosystems present smaller genome sizes in genera of phyla with genome sizes above 3 Mbp. On the other hand, estimated genome size in phyla with genomes under 3 Mbp showed no difference between ecosystems. Moreover, we observed that when using 95% average nucleotide identity (ANI) as an estimator for genetic units, only 3% of MAGs cluster together with genomes from isolated microorganisms. Although there are potential methodological limitations when assembling and binning MAGs, we found that in genome clusters containing both environmental MAGs and isolate genomes, MAGs were estimated to be on average only 3.7% smaller than isolate genomes. Even when assembly and binning methods introduce biases, the estimated genome sizes of MAGs and isolates are very similar. Finally, to better understand the ecological drivers of genome size, we discuss the known and the overlooked factors that influence genome size in different ecosystems, phylogenetic groups, and trophic strategies.

Highlights

  • As microbiologists, how do we define what is a small or a big genome? Perhaps, researchers working on model organisms such as Escherichia coli with a genome size of ∼5 Mbp (Abram et al, 2021) would define “big” or “small” differently to researchers working on soil-dwelling bacteria with a genome size of 16 Mbp (Garcia et al, 2014)

  • We found that 76.3% of representative archaeal and bacterial genomes recovered through genome-resolved metagenomics present estimated genome sizes below 4 Mbp

  • This review offers a broad overview of genome size distribution across three different ecosystem categories, showing that metagenome-assembled genomes (MAGs) recovered from aquatic and host-associated ecosystems present smaller estimated genome sizes than those recovered from terrestrial ecosystems

Introduction

How do we define what is a small or a big genome? Perhaps researchers working on model organisms such as Escherichia coli, with a genome size of ∼5 Mbp (Abram et al, 2021), would define “big” or “small” differently from researchers working on soil-dwelling bacteria with a genome size of 16 Mbp (Garcia et al, 2014). It is known that genome sizes of Archaea and Bacteria range between 100 kbp and 16 Mbp, but the genome size distribution in nature is still undefined. The aim of this review is to provide an overview of the distribution of genome sizes in different ecosystems. We leveraged recently published databases of archaeal and bacterial metagenome-assembled genomes (MAGs) (Nayfach et al, 2020; Buck et al, 2021a) together with isolate genomes to revisit and acquire an updated understanding of the estimated genome size distribution across different ecosystems. We found that 76.3% of representative archaeal and bacterial genomes recovered through genome-resolved metagenomics present estimated genome sizes below 4 Mbp. All MAGs from five archaeal phyla (Micrarchaeota, Iainarchaeota, Undinarchaeota, Nanohaloarchaeota, and Hadarchaeota) and two bacterial phyla (Coprothermobacterota and Dictyoglomota) were recovered exclusively from aquatic ecosystems and have genome sizes below 2 Mbp (Figures 1A,B).
