Short tandem repeat (STR) polymorphisms are traditionally assessed by measuring allele lengths via capillary electrophoresis (CE). Massively parallel sequencing (MPS) reveals differences among alleles of the same length, which improves discrimination and also identifies groups of alleles likely related by descent. Such groups may have relatively restricted geographical distributions, so MPS could detect population structure more effectively than CE-based analysis. We tested this possibility by applying an MPS multiplex, the Promega PowerSeq™ Auto/Mito/Y System prototype, to 362 individuals chosen to represent a wide geographical spread from the People of the British Isles (PoBI) cohort, which represents at least three generations of local rural ancestry. As well as 22 autosomal STRs (aSTRs; equivalent to PowerPlex Fusion loci), the system sequences 23 Y-STRs (the PowerPlexY 23 loci) and the control region (CR) of mitochondrial DNA (mtDNA), allowing population structure to be compared across biparentally and uniparentally inherited segments of the genome. For all loci, FST-based tests of population structure were carried out using historical, linguistic, and geographical partitions, and for aSTRs the clustering algorithm STRUCTURE was also applied. STR alleles were analysed by both length and sequence. Sequencing increased aSTR allele diversity by 87.5% compared to CE-based designations, reducing random match probability to 1.25E-30, compared to a CE-based 6.72E-27. For aSTRs, significant population structure was detectable in just one pairwise comparison (Central/South East England compared to the rest), and for sequence-based alleles only. The 362 samples carried 308 distinct mtDNA CR haplotypes corresponding to 13 broad haplogroups, representing a haplotype diversity of 0.9985 (± 0.0005) and a haplotype match probability of 0.0043. No significant mtDNA population structure was observed. Y-STR haplotypes belonged to ten broad predicted Y-haplogroups. Y-STR allele diversity increased by 33% when considered at the sequence rather than length level, although haplotype diversity was unchanged at 0.999969 (± 0.000001); haplotype match probability was 2.79E-03. In contrast to the biparentally and maternally inherited loci, Y-STR haplotypes showed significant population structure at several levels, but most markedly in a comparison of regions subject to Anglo-Saxon influence in the east with the rest of the sample. This was evident for both length- and sequence-based allele designations, with no systematic difference between the two. We conclude that MPS analysis of aSTRs or Y-STRs does not generally reveal stronger population structure than length-based analysis, that UK maternal lineages are not significantly structured, and that Y-STR haplotypes reveal significant population structure that may reflect the Anglo-Saxon migrations to Britain in the 6th century.
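The haplotype diversity and haplotype match probability figures quoted above are linked by a simple relationship, which the following minimal sketch illustrates. It assumes the commonly used estimators (match probability as the sum of squared haplotype frequencies, and Nei's unbiased diversity with the n/(n-1) correction); the abstract does not state which estimators were used, and the function names here are hypothetical.

```python
from collections import Counter


def haplotype_stats(haplotypes):
    """Return (haplotype diversity, haplotype match probability).

    Assumed estimators (not confirmed by the abstract):
      match probability = sum of squared haplotype frequencies
      diversity         = n / (n - 1) * (1 - match probability)
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("at least two haplotypes are required")
    counts = Counter(haplotypes)
    match_probability = sum((c / n) ** 2 for c in counts.values())
    diversity = n / (n - 1) * (1.0 - match_probability)
    return diversity, match_probability


def match_probability_from_diversity(diversity, n):
    """Invert the assumed diversity estimator to recover match probability."""
    return 1.0 - diversity * (n - 1) / n


if __name__ == "__main__":
    # Toy sample of four haplotype labels (illustrative only, not study data).
    hd, hmp = haplotype_stats(["A", "A", "B", "C"])
    print(f"toy sample: diversity={hd:.4f}, match probability={hmp:.4f}")

    # Consistency check against the figures reported in the abstract (n = 362).
    print(f"mtDNA CR: {match_probability_from_diversity(0.9985, 362):.4f}")
    print(f"Y-STR:    {match_probability_from_diversity(0.999969, 362):.2e}")
```

Under these assumed estimators and a sample size of 362, the reported diversities of 0.9985 and 0.999969 correspond to match probabilities of roughly 0.0043 and 2.79E-03, in line with the values given in the abstract.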