Abstract

An enlarged mtDNA database (n=549) for the Portuguese population, comprising the HVRI and HVRII regions, is reported. This database was used to test the effect of sample size on the estimation of relevant parameters such as haplotype diversity, number of different haplotypes, nucleotide diversity and number of polymorphic positions. Simulations were performed by generating sets of random subsamples of variable sizes (n=50, 100, 200, 300 and 400). The results show that while haplotype and nucleotide diversities do not vary significantly with sample size, the numbers of haplotypes and polymorphic positions rise continuously within the tested interval. These trends can be explained by the changing proportions of sequences found only once or twice, which drop dramatically as sample size increases, with a corresponding rise in the frequency of those encountered three times or more. The generated data were also used to extrapolate saturation curves for these parameters. When considering, for instance, the number of haplotypes, a sample size of 1,000 individuals is required for practical saturation (defined as the point where a sample size increase of 100 individuals corresponds to an increment in the diversity measure below 5%). For HVRII the same level is reached at n=900, and n=1,300 is needed when both regions are analysed simultaneously. Consequently, we can infer that currently used sample sizes are still rather inadequate for both anthropological and forensic purposes.
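
The subsampling procedure described above can be illustrated with a minimal sketch. It is not the authors' code: the abstract does not say which estimator or software was used, so the sketch assumes Nei's unbiased haplotype diversity, sampling without replacement, and a relative reading of the 5% saturation criterion. The names `haplotype_diversity`, `subsample_stats` and `practically_saturated` are hypothetical.

```python
import random
from collections import Counter


def haplotype_diversity(haplotypes):
    """Nei's unbiased estimator: n/(n-1) * (1 - sum of squared frequencies).

    Assumption: the abstract does not name the estimator actually used.
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two sequences")
    counts = Counter(haplotypes)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1 - sum_sq)


def subsample_stats(haplotypes, size, replicates=100, seed=0):
    """Draw random subsamples (without replacement) of the given size and
    return the mean haplotype diversity and mean number of distinct
    haplotypes across replicates."""
    rng = random.Random(seed)
    diversities = []
    distinct = []
    for _ in range(replicates):
        sub = rng.sample(haplotypes, size)
        diversities.append(haplotype_diversity(sub))
        distinct.append(len(set(sub)))
    return sum(diversities) / replicates, sum(distinct) / replicates


def practically_saturated(value_at_n, value_at_n_plus_100, threshold=0.05):
    """Saturation criterion from the abstract, read here as a relative
    increment below 5% when the sample grows by 100 individuals
    (the relative reading is an assumption)."""
    return (value_at_n_plus_100 - value_at_n) / value_at_n < threshold
```

With a list of haplotype labels for the full database, calling `subsample_stats(database, 50)`, `subsample_stats(database, 100)` and so on would reproduce the kind of comparison reported here: the first returned value should stay roughly flat, while the second keeps growing with sample size.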
