The fixation index, F_IS, has been a staple measure for detecting selection or departures from random mating in populations. However, current next-generation sequencing (NGS) cannot easily estimate F_IS in multi-locus gene families, in which multiple loci carry similar or identical arrays of variant sequences of ≥1 kilobase (kb) that differ at multiple positions. In these families, high-quality short-read NGS data typically identify the variants but not their genomic location, which is required to calculate F_IS (based on locus-specific observed and expected heterozygosity). Thus, to assess assortative mating or selection on heterozygotes from NGS of multi-locus gene families, we need a method that does not require knowing which variants are alleles at which locus in the genome. We developed such a method. Like F_IS, our novel measure, 1H_IS, is based on the principle that positive assortative mating, selection against heterozygotes, and some other processes reduce within-individual variability relative to the population. We demonstrate high accuracy of 1H_IS across a wide range of simulated scenarios and in two datasets from natural populations of penguins and dolphins. 1H_IS is important because multi-locus gene families are often involved in assortative mating or selection on heterozygotes. It is particularly useful for multi-locus gene families such as toll-like receptors, the major histocompatibility complex in animals, homeobox genes in fungi and self-incompatibility genes in plants.
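For orientation, the conventional single-locus fixation index that the abstract refers to can be sketched as follows. This is a minimal illustration of the textbook definition, one minus the ratio of observed to expected heterozygosity, and not the new measure introduced in the paper. The function name `f_is`, the tuple-based genotype format, and the example data are assumptions made here for illustration, and no small-sample correction is applied.

```python
from collections import Counter


def f_is(genotypes):
    """Textbook fixation index for one diploid locus: 1 - H_obs / H_exp.

    `genotypes` is a list of (allele_a, allele_b) tuples, one per individual.
    This hypothetical helper assumes alleles are already assigned to the
    locus, which is exactly what short-read data from multi-locus gene
    families cannot provide.
    """
    n = len(genotypes)
    if n == 0:
        raise ValueError("at least one genotype is required")

    # Observed heterozygosity: fraction of individuals with two different alleles.
    h_obs = sum(a != b for a, b in genotypes) / n

    # Expected heterozygosity under random mating: 1 - sum of squared allele frequencies.
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = 2 * n
    h_exp = 1 - sum((c / total) ** 2 for c in counts.values())
    if h_exp == 0:
        raise ValueError("locus is monomorphic; the index is undefined")

    return 1 - h_obs / h_exp


# Made-up example: a deficit of heterozygotes gives a positive value.
deficit = [("A", "A")] * 4 + [("A", "B")] * 2 + [("B", "B")] * 4
# Made-up example: Hardy-Weinberg proportions give a value near zero.
hardy_weinberg = [("A", "A")] + [("A", "B")] * 2 + [("B", "B")]
```

In the first example the observed heterozygosity is 0.2 and the expected heterozygosity is 0.5, so the index is about 0.6. In the second both are 0.5, so the index is about 0. The calculation depends on knowing which alleles belong to the locus, which is the requirement a locus-independent measure has to avoid.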