The SAR11 clade, here represented by Candidatus Pelagibacter ubique, is the most successful group of bacteria in the upper surface waters of the oceans. In contrast to previous studies that have associated the 1.3 Mb genome of Ca. Pelagibacter ubique with the less than 1.5 Mb genomes of the Rickettsiales, our phylogenetic analysis suggests that Ca. Pelagibacter ubique is most closely related to soil and aquatic Alphaproteobacteria with large genomes. This implies that the SAR11 clade and the Rickettsiales have undergone genome reduction independently. A gene flux analysis of 46 representative alphaproteobacterial genomes indicates the loss of more than 800 genes in each of Ca. Pelagibacter ubique and the Rickettsiales. Consistent with their different phylogenetic affiliations, the pattern of gene loss differs, with a higher loss of genes for repair and recombination processes in Ca. Pelagibacter ubique, as compared with a more extensive loss of genes for biosynthetic functions in the Rickettsiales. Some of the lost genes in Ca. Pelagibacter ubique, such as mutLS, recFN, and ruvABC, are conserved in all other alphaproteobacterial genomes, including the small genomes of the Rickettsiales. The mismatch repair genes mutLS are absent from all currently sequenced SAR11 genomes and are also underrepresented in the global ocean metagenome data set. We hypothesize that the unique loss of genes involved in repair and recombination processes in Ca. Pelagibacter ubique has been driven by selection and that this helps explain many of the characteristics of the SAR11 population, such as the streamlined genomes, the long branch lengths, the high recombination frequencies, and the extensive sequence divergence within the population.