Abstract

Over the past several decades, polymorphic genetic loci have been discussed for their utility in human phylogenetic inference, and Short Tandem Repeat (STR) loci have shown promising results for this purpose. Unfortunately, allele frequency data for polymorphic loci are largely confined to a few populations, so the number of loci shared by all populations declines as the number of populations increases. We hypothesize that even a small number of STR loci can be used efficiently for phylogenetic purposes if an appropriate theoretical and statistical strategy is employed. Such a strategy provides a feasible and cost-effective way to choose appropriate STR loci for phylogenetic studies. To test this, an empirical study was conducted using allele frequency data for three STR loci, CSF1PO, TPOX, and TH01, across 98 human populations from the literature (references are available at http://dnaa.bravehost.com/index.html and http://www.cstl.nist.gov/strbase/population/Omnipop). The markers were chosen for their polymorphism, high heterozygosity, low mutation rate, fewer artifacts, and independence between loci. Three measures were used to estimate genetic distances between the populations: Cavalli-Sforza’s chord distance (DC), Nei’s genetic distance (DA), and Nei’s standard genetic distance (DST). For each measure, the coefficient of variation (CV) was calculated across 100 datasets obtained by resampling the original dataset. The CV ranked DST > DA > DC. Therefore, a consensus tree based on DC, the measure with the lowest CV, was constructed using the Neighbour Joining (NJ), Unweighted Pair Group Method with Arithmetic Mean (UPGMA), and Maximum Likelihood (ML) methods. NJ and UPGMA received more statistical support, that is, higher bootstrap values, than ML (NJ > UPGMA > ML). A validation study was performed using (A) Principal Component Analysis, (B) comparison with trees reported for other molecular markers, and (C) STR genotyping of five Pakistani subpopulations.
The results strongly supported our hypothesis that the three STR markers CSF1PO, TPOX, and TH01 successfully delineate ethnic, geographic, and linguistic differentiation between the populations.
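
As an illustration of the three distance measures named in the abstract, the following is a minimal sketch using their commonly cited textbook forms. The abstract itself does not give the formulas, so these forms are an assumption, as are the function names and the allele frequencies, which are hypothetical and are not data from the study. Each population is a list of loci, and each locus is a list of allele frequencies summing to 1.

```python
import math


def _sqrt_overlap(x, y):
    """Sum over alleles of sqrt(x_i * y_i) at a single locus."""
    return sum(math.sqrt(a * b) for a, b in zip(x, y))


def chord_distance(pop_x, pop_y):
    """Cavalli-Sforza chord distance (DC), averaged over loci."""
    n_loci = len(pop_x)
    total = sum(
        # max() guards against tiny negative values caused by rounding
        math.sqrt(2.0 * max(0.0, 1.0 - _sqrt_overlap(x, y)))
        for x, y in zip(pop_x, pop_y)
    )
    return 2.0 * total / (math.pi * n_loci)


def nei_da(pop_x, pop_y):
    """Nei's DA distance: one minus the mean per-locus sqrt overlap."""
    n_loci = len(pop_x)
    return 1.0 - sum(_sqrt_overlap(x, y) for x, y in zip(pop_x, pop_y)) / n_loci


def nei_standard(pop_x, pop_y):
    """Nei's standard genetic distance (DST) from mean homozygosities."""
    n_loci = len(pop_x)
    j_x = sum(sum(a * a for a in x) for x in pop_x) / n_loci
    j_y = sum(sum(b * b for b in y) for y in pop_y) / n_loci
    j_xy = sum(sum(a * b for a, b in zip(x, y)) for x, y in zip(pop_x, pop_y)) / n_loci
    return -math.log(j_xy / math.sqrt(j_x * j_y))


# Hypothetical allele frequencies at two loci (illustration only).
pop_a = [[0.3, 0.4, 0.3], [0.6, 0.4]]
pop_b = [[0.1, 0.5, 0.4], [0.3, 0.7]]

for name, fn in [("DC", chord_distance), ("DA", nei_da), ("DST", nei_standard)]:
    print(name, round(fn(pop_a, pop_b), 4))
```

All three functions return 0 (up to rounding) for identical populations. DC and DA are bounded above, whereas DST is undefined when two populations share no alleles.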

Introduction

Phylogenetic inferences are premised on the inheritance of ancestral characteristics and on the existence of an evolutionary history defined by changes in these characteristics (Li, Pearl, & Doss, 2000). Allele frequency data for polymorphic loci are largely confined to European, North American, and East Asian populations (Nei & Roychoudhury, 1993). For this reason, the number of shared loci declines as the number of populations increases. In the present study, we hypothesize that a minimal number of markers can perform efficiently for phylogenetic inference if an appropriate theoretical and statistical strategy is applied. The statistical analyses performed on the datasets of the empirical and validation studies are explained in the section ‘Theory and Calculations’.
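
One part of such a statistical strategy is checking how stable a distance measure is under resampling. The sketch below computes a coefficient of variation (standard deviation divided by mean) of a distance across resampled datasets. The resampling scheme is an assumption, since this excerpt does not specify it: allele counts are redrawn from each population's frequencies with a hypothetical sample size of 200 alleles per locus. The chord distance is used as the example measure, and all names and frequencies are illustrative.

```python
import math
import random
import statistics


def chord_distance(pop_x, pop_y):
    """Cavalli-Sforza chord distance in its commonly cited form."""
    total = 0.0
    for x, y in zip(pop_x, pop_y):
        overlap = sum(math.sqrt(a * b) for a, b in zip(x, y))
        total += math.sqrt(2.0 * max(0.0, 1.0 - overlap))
    return 2.0 * total / (math.pi * len(pop_x))


def resample_freqs(freqs, n_alleles, rng):
    """Redraw n_alleles alleles from freqs and return the new frequencies."""
    draws = rng.choices(range(len(freqs)), weights=freqs, k=n_alleles)
    return [draws.count(i) / n_alleles for i in range(len(freqs))]


def distance_cv(pop_x, pop_y, distance, n_alleles=200, replicates=100, seed=0):
    """Coefficient of variation of a distance across resampled datasets."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    values = []
    for _ in range(replicates):
        rx = [resample_freqs(f, n_alleles, rng) for f in pop_x]
        ry = [resample_freqs(f, n_alleles, rng) for f in pop_y]
        values.append(distance(rx, ry))
    return statistics.stdev(values) / statistics.mean(values)


# Hypothetical allele frequencies at two loci (illustration only).
pop_x = [[0.3, 0.4, 0.3], [0.6, 0.4]]
pop_y = [[0.1, 0.5, 0.4], [0.3, 0.7]]

print("CV of DC:", distance_cv(pop_x, pop_y, chord_distance))
```

Passing a different distance function as the `distance` argument lets several measures be compared on the same footing, with a lower CV indicating a measure that is less sensitive to sampling noise.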

Choosing the STR Loci
World Population Data
STR Polymorphism
Genetic Distance Measure
Construction of Phylogenetic Trees
Consensus Tree
Comparison with other Phylogenetic Trees
Discussion