Abstract

The ability to infer personal genetic ancestry is increasingly being used in certain medical and forensic situations. Herein, the unsupervised Bayesian clustering algorithm, structure, is employed to analyse 377 autosomal short tandem repeats typed on 1,056 individuals from the Centre d'Etude du Polymorphisme Humain Human Diversity Panel. Individuals of known geographical origin were hierarchically classified into a framework of increasingly homogeneous clusters to serve as reference populations to which individuals of unknown ancestry can be assigned. The groupings were characterised by the geographical affinities of cluster members, and the accuracy of these procedures was verified using several genetic indices. Fine-scale substructure was detectable beyond the broad population-level classifications that have previously been explored in this dataset. Metrics indicated that, within certain lines, the strongest structuring signals were detected at the leaves of the hierarchy, where lineage-specific groupings were identified. The accuracy of assigning unknown individuals was assessed at each level of the hierarchy using a 'leave one out' strategy, in which each individual was stripped of cluster membership and then re-assigned using the supervised Bayesian clustering algorithm implemented in GeneClass2. Although most clusters at all levels of resolution experienced highly accurate assignment, a decline was observed at the finer levels owing to the mixed membership characteristics of some individuals. The parameters defined by this study allowed unknown individuals to be assigned to genetically defined clusters with measured likelihood. Shared ancestry data can then be inferred for the unknown individual.
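The 'leave one out' strategy described above can be illustrated with a minimal sketch. It is not the GeneClass2 procedure: it swaps in a simple allele-frequency likelihood with add-one smoothing, and the function names, toy genotypes and cluster labels are all hypothetical.

```python
import math
from collections import Counter, defaultdict

def allele_counts(genotypes, labels, skip=None):
    """Count alleles per (cluster, locus), optionally leaving one individual out."""
    counts = defaultdict(Counter)
    for ind, loci in genotypes.items():
        if ind == skip:
            continue
        for locus, alleles in enumerate(loci):
            counts[(labels[ind], locus)].update(alleles)
    return counts

def log_likelihood(loci, cluster, counts, n_alleles):
    """Log-likelihood of a genotype under a cluster's smoothed allele frequencies.

    The heterozygote factor of 2 is omitted: it is the same for every cluster,
    so it does not change which cluster scores highest.
    """
    total = 0.0
    for locus, alleles in enumerate(loci):
        c = counts[(cluster, locus)]
        denom = sum(c.values()) + n_alleles[locus]  # add-one smoothing
        for allele in alleles:
            total += math.log((c[allele] + 1) / denom)
    return total

def leave_one_out_accuracy(genotypes, labels):
    """Fraction of individuals re-assigned to their original cluster."""
    clusters = sorted(set(labels.values()))
    n_loci = len(next(iter(genotypes.values())))
    # Number of distinct alleles observed at each locus across all individuals.
    n_alleles = [
        len({a for loci in genotypes.values() for a in loci[locus]})
        for locus in range(n_loci)
    ]
    correct = 0
    for ind, loci in genotypes.items():
        # Strip the individual of cluster membership, then re-assign it.
        counts = allele_counts(genotypes, labels, skip=ind)
        best = max(clusters, key=lambda k: log_likelihood(loci, k, counts, n_alleles))
        correct += best == labels[ind]
    return correct / len(genotypes)

# Hypothetical toy data: two loci, alleles given as repeat counts.
genotypes = {
    "a1": [(10, 10), (20, 21)],
    "a2": [(10, 11), (20, 20)],
    "a3": [(10, 10), (21, 20)],
    "b1": [(14, 15), (30, 30)],
    "b2": [(15, 15), (30, 31)],
    "b3": [(14, 14), (31, 30)],
}
labels = {ind: ind[0].upper() for ind in genotypes}

accuracy = leave_one_out_accuracy(genotypes, labels)
```

On this toy data the two clusters share no alleles, so every individual is re-assigned to its original cluster. With real data, individuals of mixed membership would be expected to lower the accuracy at finer levels of the hierarchy, as the abstract reports.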

Highlights

  • Hypervariable microsatellite markers, situated across the autosomes, have been shown to produce stronger resolution for high-level differentiation of populations when compared with biallelic markers.[1]

  • Using the Centre d’Etude du Polymorphisme Humain (CEPH) Human Diversity Panel dataset, we describe and validate its decomposition into fine-scale reference populations to which unknown individuals can be assigned with measured likelihood, revealing relevant ancestral information on a more recent time scale for the unknown individual.

  • With the entire hierarchical structure definition as a parameter, unknown individuals can potentially be assigned to highly resolved cluster definitions that represent specific localities and likely family groups.


Summary

Introduction

Hypervariable microsatellite markers, situated across the autosomes, have been shown to produce stronger resolution for high-level differentiation of populations when compared with biallelic markers.[1] Although it has been demonstrated that homoplastic mutations increase the likelihood of common identical-by-state alleles among unrelated individuals, thereby reducing variance for the individual, adjusted estimates for within-population variance (0.812–0.854)[8] still exceed between-population variance. This indicates an overall similarity between populations, as defined by current geopolitical or other proxy designations, and strong variance within such populations. A systematic hierarchical analysis of the genetic composition of subpopulations would allow groups that have strong genetic homogeneity to be identified and reveal relationships that are probably due to extended familial ties. These relationships may persist across geopolitical borders, but are expected to …

HENRY STEWART PUBLICATIONS 1473–9542.

