Skews in the observed allele-frequency spectrum are frequently interpreted as evidence of non-neutral evolution. Recent surveys of microsatellite variability have used an excess of alleles as a statistical criterion for inferring positive selection. Using neutral coalescent simulations, we demonstrate that the mean numbers of alleles expected under the stepwise-mutation model and the infinite-allele model deviate from the observed numbers of alleles. The magnitude of this difference depends on the sample size, the mutation rates (theta values) and the observed gene diversities. Moreover, we show that the number of observed alleles differs among loci with the same observed gene diversity but different mutation rates (theta values). We propose that a reliable test statistic based on allele excess must determine its confidence interval by computer simulations conditional on the observed gene diversity and theta values. As the latter are notoriously difficult to obtain for experimental data, we suggest that other statistics, such as lnRV, may be better suited to identifying microsatellite loci subject to selection.
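The kind of simulation described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a symmetric single-step stepwise-mutation model, a constant-size neutral coalescent with theta as the scaled mutation rate, and a simple rejection step that keeps only replicates whose gene diversity lies within a tolerance of the observed value. The function names, the tolerance `tol` and the default replicate count are hypothetical choices made for illustration.

```python
import random
from collections import Counter


def simulate_smm_sample(n, theta, rng):
    """Simulate allele sizes (relative repeat counts) for n gene copies.

    Assumption: neutral coalescent with symmetric +/-1 stepwise mutation.
    With k lineages, coalescence occurs at rate k(k-1)/2 and mutation at
    rate k*theta/2, so the next event is a coalescence with probability
    (k-1)/(k-1+theta).
    """
    lineages = [[i] for i in range(n)]  # sampled copies below each lineage
    sizes = [0] * n                     # offset from the ancestral allele
    while len(lineages) > 1:
        k = len(lineages)
        if rng.random() < (k - 1) / (k - 1 + theta):
            # Coalescence: merge two randomly chosen lineages.
            i, j = rng.sample(range(k), 2)
            lineages[i].extend(lineages[j])
            lineages.pop(j)
        else:
            # Mutation: shift every descendant of one lineage by one step.
            step = rng.choice((-1, 1))
            for s in rng.choice(lineages):
                sizes[s] += step
    return sizes


def number_of_alleles(sizes):
    """Count the distinct allele sizes in the sample."""
    return len(set(sizes))


def gene_diversity(sizes):
    """Unbiased gene diversity: n/(n-1) * (1 - sum of squared frequencies)."""
    n = len(sizes)
    counts = Counter(sizes)
    return n / (n - 1) * (1 - sum((c / n) ** 2 for c in counts.values()))


def alleles_given_diversity(n, theta, h_obs, tol=0.05, reps=2000, seed=1):
    """Allele counts from replicates whose diversity is within tol of h_obs.

    Hypothetical rejection scheme: tol and reps are illustrative defaults.
    """
    rng = random.Random(seed)
    kept = []
    for _ in range(reps):
        sizes = simulate_smm_sample(n, theta, rng)
        if abs(gene_diversity(sizes) - h_obs) <= tol:
            kept.append(number_of_alleles(sizes))
    return kept
```

Calling `alleles_given_diversity(50, 5.0, 0.7)` and again with a different `theta` gives two simulated distributions of allele number at the same gene diversity. Comparing them is one way to see why a confidence interval for allele excess would have to be conditioned on both quantities.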