Abstract

Summaries of human genomic variation shed light on human evolution and provide a framework for biomedical research. Variation is often summarised in terms of one or a few statistics (e.g. FST and gene diversity). Now that multilocus genotypes for hundreds of autosomal loci are available for thousands of individuals, new approaches are applicable. Recently, trees of individuals and other clustering approaches have demonstrated the power of an individual-focused analysis. We propose analysing the distributions of genetic distances between individuals. Each distribution, or common ancestry profile (CAP), is unique to an individual and does not require a priori assignment of individuals to populations. Here, we consider a range of models of population history and, using coalescent simulation, reveal the potential insights gained from a set of CAPs. Information lies in the shapes of individual profiles (sometimes captured by the variance of individual CAPs) and in the variation across profiles. Analysis of short tandem repeat genotype data for over 1,000 individuals from 52 populations is consistent with dramatic differences in population histories across human groups.
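As a rough illustration of what a CAP is, the sketch below builds one profile per individual from toy multilocus genotypes and summarises each profile by its mean and variance. The distance measure (an allele-sharing distance), the function names and the toy genotypes are all assumptions made for illustration; the abstract does not state which genetic distance the authors use.

```python
from statistics import mean, pvariance

def allele_sharing_distance(g1, g2):
    """Mean per-locus distance between two diploid multilocus genotypes.

    Each genotype is a list of (allele_a, allele_b) tuples, one per locus.
    Per-locus distance is 1 - (shared alleles / 2). This is an illustrative
    choice of distance, not necessarily the one used in the paper.
    """
    total = 0.0
    for (a1, a2), (b1, b2) in zip(g1, g2):
        # Best pairing of the two alleles, so shared is 0, 1 or 2.
        shared = max((a1 == b1) + (a2 == b2), (a1 == b2) + (a2 == b1))
        total += 1 - shared / 2
    return total / len(g1)

def common_ancestry_profiles(genotypes):
    """Map each individual to its distances to every other individual."""
    names = list(genotypes)
    return {
        i: [allele_sharing_distance(genotypes[i], genotypes[j])
            for j in names if j != i]
        for i in names
    }

def profile_summary(profile):
    """Mean and (population) variance of one individual's profile."""
    return mean(profile), pvariance(profile)

# Hypothetical STR genotypes (repeat counts) at three loci.
genotypes = {
    "ind1": [(10, 12), (7, 7), (15, 16)],
    "ind2": [(10, 12), (7, 8), (15, 15)],
    "ind3": [(11, 13), (9, 9), (14, 17)],
}

caps = common_ancestry_profiles(genotypes)
summaries = {name: profile_summary(p) for name, p in caps.items()}
```

No population labels enter the calculation: each profile depends only on the individual's own distances to everyone else, which is the property the abstract emphasises.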

Highlights

  • The collective human gene pool, consisting of the genomes of all living people, has much to reveal regarding human population history

  • Surveys of human genetic variation have been sparse, in that hundreds or thousands of individuals have been studied for a small number of genetic regions (e.g. blood groups, human leukocyte antigens (HLA), mitochondrial DNA, Y chromosome[1–3]) and a few individuals have been studied for a large fraction of the genome

  • To facilitate interpretation of common ancestry profiles (CAPs), we have considered a set of simple models of population history, simulating genetic variation in the context of those models using a coalescent approach


Introduction

The collective human gene pool, consisting of the genomes of all living people, has much to reveal regarding human population history. Surveys of human genetic variation have been sparse, in that hundreds or thousands of individuals have been studied for a small number of genetic regions (e.g. blood groups, human leukocyte antigens (HLA), mitochondrial DNA, Y chromosome[1–3]) and a few individuals have been studied for a large fraction of the genome (e.g. through the Human Genome Project). In the past few years, larger sets of individuals have been studied for hundreds of genetic regions[4] and, concomitantly, new data analysis tools have been developed.[5] With new data and new tools, we are rapidly gaining a more precise understanding of how genetically similar individuals are, and of how that similarity corresponds to other dimensions of human variation. Two DNA sequences chosen at random appear to differ, on average, at about one in every 1,000–1,500 nucleotide sites.[7,8,9] This level of diversity corresponds to between 2 and 3.2 million nucleotide differences between individual genomes and is about one order of magnitude lower than the diversity detected within Drosophila (fruit fly) populations.[7]
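The step from a per-site difference rate to the quoted total of 2 to 3.2 million differences can be checked with a short calculation. The genome length of 3.2 billion base pairs used here is an assumption, chosen because it reproduces the upper figure; the text does not state the length it uses.

```python
# Assumed haploid genome length in base pairs; not given in the text.
GENOME_SIZE_BP = 3.2e9

def expected_differences(sites_per_difference, genome_size=GENOME_SIZE_BP):
    """Expected number of differing sites between two genome copies,
    given one difference per `sites_per_difference` nucleotide sites."""
    return genome_size / sites_per_difference

high = expected_differences(1_000)  # one difference per 1,000 sites
low = expected_differences(1_500)   # one difference per 1,500 sites
```

Under that assumption, `high` is 3.2 million and `low` is roughly 2.1 million, which matches the range given in the text.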

