Abstract

Mutations are fundamental to evolution, and their mathematical modelling in population genetics relies heavily on our understanding of their frequency and the timescale over which they occur. A common assumption is that mutations are infrequent, so that by the time a new mutation arises, the previous one has either become fixed or been lost. This assumption implies that mutations occur exclusively at fixed sites and is referred to as the boundary mutation model. Alternatively, one can assume a recurrent mutation model, which additionally allows mutations to contribute to shifts in allele frequency. In this study, we compare these two models. By examining mutation rates and effective population sizes across the Tree of Life, we demonstrate that the boundary mutation model remains valid for most species but deviates significantly from the recurrent mutation model in bacteria. Our analyses further reveal that the boundary mutation model tends to overestimate the effective population size, particularly in bacteria, where estimated population sizes can be more than five times larger than those expected under the recurrent mutation model. We address these biases by proposing a Bayesian estimator of population size that accounts for recurrent mutations. To illustrate how mutation models can influence the quantification of forces other than drift, we further present a case study showing that the boundary mutation model exaggerates the intensity of selective constraints acting on the three codon positions of the bacterium Pseudomonas fluorescens. This study emphasizes the importance of considering recurrent mutations in highly diverse species for accurate population genetic inference.