Abstract

Latin American populations stem from the admixture of Europeans, Africans and Native Americans, which started over 400 years ago and lasted for several centuries. Extreme deviations from the genome-wide average in ancestry estimates at particular genomic locations could reflect recent natural selection. We evaluated the distribution of ancestry estimates using 678 genome-wide microsatellite markers in 249 individuals from 13 admixed populations across Latin America. We found significant deviations in ancestry estimates, including three locations more than 3.5 standard deviations from the genome-wide average: an excess of European ancestry at 1p36 and 14q32, and an excess of African ancestry at 6p22. Using simulations, we showed that at least the deviation at 6p22 was unlikely to result from genetic drift alone. By using different linguistic groups, as well as the most likely ancestral Native American populations, as the ancestral reference, we showed that the choice of Native American ancestry could affect local ancestry estimation. However, the signal at 6p22 appeared consistently in most of the analyses using various ancestral groups. This study provides important insights into recent natural selection in the context of the unique history of the New World, with implications for disease mapping.

Highlights

  • Since the particular ancestral populations contributing to the Latin American genomes are not known with certainty, we integrated all the African and European samples as the approximate ‘mean’ ancestral pool of Latin Americans

  • The data we used in this study could not fully represent the populations involved in the admixture about 400 years ago, given the genetic drift, natural selection and other demographic events that may have occurred in these populations since then

  • STRUCTURE was applied to the genome-wide microsatellite data to estimate the ancestry fractions in the Latin Americans (See Materials and Methods)



Results

Since the particular ancestral populations contributing to the Latin American genomes are not known with certainty, we integrated all the African and European samples as the approximate ‘mean’ ancestral pool of Latin Americans. The top three signals, each more than 3.5 standard deviations from the genome-wide average, are AAT238 at 1p36 (Z score = 3.76, FDR adjusted P value = 0.047) and GATA51F04 at 14q32 (Z score = 3.63, FDR adjusted P value = 0.047) for an excess of European ancestry, and ATA12D05 at 6p22 (Z score = 4.67, FDR adjusted P value = 0.001) for an excess of African ancestry (Fig. 1). These signals are generally consistent across the 10 replications of admixture analysis, with very small variations (see Materials and Methods, and Supplementary Table S1). The heterozygosity at 6p22 is not significantly higher than that of other genome-wide markers (excess of heterozygosity = 0.05, Z score = 0.6, P value = 0.27). These findings suggest that the signal at 6p22 inferred from admixture analysis is more likely to be an indication of positive selection than of balancing selection. In the microsatellite data used in our study, no considerable increase of LD was observed in this region in any of the ancestral populations (P value > 0.05, Chi-square test for pairwise markers at 6p22).
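The Z scores and FDR adjusted P values reported above can be illustrated with a minimal sketch. It assumes that each marker's ancestry estimate is standardized against the mean and standard deviation across all markers, that P values come from a two-sided normal approximation, and that the FDR adjustment is Benjamini-Hochberg. None of these choices is confirmed by the text here, and the function names are hypothetical, so this is an illustration of the general technique rather than the authors' procedure.

```python
import math
from statistics import mean, stdev


def ancestry_z_scores(locus_ancestry):
    """Standardize per-marker ancestry estimates against the genome-wide
    mean and sample standard deviation (assumed definition of the Z score)."""
    mu = mean(locus_ancestry)
    sd = stdev(locus_ancestry)
    return [(x - mu) / sd for x in locus_ancestry]


def two_sided_p(z):
    """Two-sided P value under a standard normal approximation
    (an assumption; the source does not state the test's sidedness)."""
    return math.erfc(abs(z) / math.sqrt(2))


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg FDR adjustment, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical African-ancestry fractions at a handful of markers.
    toy = [0.10, 0.12, 0.09, 0.11, 0.10, 0.25, 0.11, 0.10]
    z = ancestry_z_scores(toy)
    p_adj = benjamini_hochberg([two_sided_p(v) for v in z])
    for frac, zi, pi in zip(toy, z, p_adj):
        print(f"ancestry={frac:.2f}  Z={zi:+.2f}  FDR-adjusted P={pi:.3f}")
```

With 678 markers, the adjustment step matters: a raw P value that looks small at a single marker can easily arise by chance somewhere along the genome, which is why the adjusted values are the ones quoted.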
