Bacterial genomes primarily diversify via gain, loss, and rearrangement of genetic material in their flexible accessory genome. Yet the dynamics of accessory genome evolution are poorly understood, in contrast to the core genome, where diversification is readily described by mutations and homologous recombination. Here, we tackle this problem for the case of very closely related genomes. We comprehensively describe genome evolution within n = 222 genomes of Escherichia coli ST131, which likely shared a common ancestor around 100 years ago. After removing putative recombinant diversity, the total length of the phylogeny is 6,000 core genome substitutions. Within this diversity, we find 22 modifications to core genome synteny and estimate around 2,000 structural changes within the accessory genome, i.e., one structural change for every three core genome substitutions. Sixty-three percent of loci with structural diversity could be resolved into individual gain and loss events, with 10-fold more gains than losses, demonstrating a dominance of gains due to insertion sequences and prophage integration. Our results suggest that the majority of synteny changes and insertions in our dataset are likely deleterious and persist only for a short time before being removed by purifying selection.