Abstract

Background: Genomic regions of autozygosity (ROA) arise when an individual is homozygous for haplotypes inherited identical-by-descent from ancestors shared by both parents. Over the past decade, they have gained importance for understanding evolutionary history and the genetic basis of complex diseases and traits. However, methods to infer ROA in dense genotype data have not evolved in step with advances in genome technology that now enable us to rapidly create large high-resolution genotype datasets, limiting our ability to investigate their constituent ROA patterns.

Methods: We report a weighted likelihood approach for inferring ROA in dense genotype data that accounts for autocorrelation among genotyped positions and the possibilities of unobserved mutation and recombination events, and variability in the confidence of individual genotype calls in whole genome sequence (WGS) data.

Results: Forward-time genetic simulations under two demographic scenarios that reflect situations where inbreeding and its effect on fitness are of interest suggest this approach is better powered than existing state-of-the-art methods to infer ROA at marker densities consistent with WGS and popular microarray genotyping platforms used in human and non-human studies. Moreover, we present evidence that suggests this approach is able to distinguish ROA arising via consanguinity from ROA arising via endogamy. Using subsets of The 1000 Genomes Project Phase 3 data we show that, relative to WGS, intermediate and long ROA are captured robustly with popular microarray platforms, while detection of short ROA is more variable and improves with marker density. Worldwide ROA patterns inferred from WGS data are found to accord well with those previously reported on the basis of microarray genotype data. Finally, we highlight the potential of this approach to detect genomic regions enriched for autozygosity signals in one group relative to another based upon comparisons of per-individual autozygosity likelihoods instead of inferred ROA frequencies.

Conclusions: This weighted likelihood ROA inference approach can assist population- and disease-geneticists working with a wide variety of data types and species to explore ROA patterns and to identify genomic regions with differential ROA signals among groups, thereby advancing our understanding of evolutionary history and the role of recessive variation in phenotypic variation and disease.
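
The abstract does not give the likelihood formula, so the following is only a minimal sketch of how a windowed log-odds (LOD) score for autozygosity is commonly built, not the authors' implementation. It assumes Hardy-Weinberg genotype frequencies under non-autozygosity and a single error/mutation rate `eps` under autozygosity. The function names, the `window` size, and the per-SNV `weights` are hypothetical placeholders for the corrections the abstract mentions (autocorrelation among positions, recombination, genotype-call confidence).

```python
import math

def snv_log_odds(genotype, p, eps=0.001):
    """log10 odds that one SNV genotype is autozygous versus not.

    genotype: copies of allele A carried (0, 1 or 2).
    p: population frequency of allele A, with 0 < p < 1.
    eps: assumed combined rate of genotyping error and mutation.
    """
    q = 1.0 - p
    if genotype == 2:
        autozygous = (1 - eps) * p + eps * p * p
        non_autozygous = p * p
    elif genotype == 0:
        autozygous = (1 - eps) * q + eps * q * q
        non_autozygous = q * q
    else:
        # A heterozygote inside an autozygous segment needs an error or mutation.
        autozygous = eps * 2 * p * q
        non_autozygous = 2 * p * q
    return math.log10(autozygous / non_autozygous)

def window_scores(genotypes, freqs, window=3, weights=None, eps=0.001):
    """Weighted sum of per-SNV log-odds in each sliding window of SNVs.

    weights: hypothetical per-SNV weights (all 1.0 if omitted), standing in
    for the corrections that a weighted approach would apply.
    """
    if weights is None:
        weights = [1.0] * len(genotypes)
    per_snv = [
        w * snv_log_odds(g, p, eps)
        for g, p, w in zip(genotypes, freqs, weights)
    ]
    return [
        sum(per_snv[i:i + window])
        for i in range(len(per_snv) - window + 1)
    ]

# Three homozygous SNVs followed by one heterozygous SNV.
scores = window_scores([2, 2, 2, 1], [0.5, 0.5, 0.5, 0.5], window=3)
print(scores)
```

In this toy input the first window covers only homozygous genotypes and scores above zero, while the second window includes the heterozygote and scores below zero. A score threshold would then separate autozygous from non-autozygous windows.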

Highlights

  • Genomic regions of autozygosity (ROA) arise when an individual is homozygous for haplotypes inherited identical-by-descent from ancestors shared by both parents

  • These false discovery rates result solely from overcalling true ROA and not from erroneous ROA calls. This is reflected in the ratios of inferred to true ROA length (Fig. 3), which increase with decreasing single-nucleotide variant (SNV) density and for smaller ROA, and approach, but never quite reach, one with increasing ROA length. These findings indicate that the weighted LOD-based (wLOD) method is well powered to detect ROA with high sensitivity and good specificity at a wide range of SNV densities that are consistent with whole genome sequence (WGS) as well as popular microarray-based platforms that are commonly used in human and non-human studies of ROA, and in particular long ROA that are of interest in studies of Mendelian and complex diseases and traits

  • Geographic properties of the wLOD method: We have shown the wLOD method to be well powered to detect ROA in genetic datasets consistent with WGS and microarray-based genotyping. Our investigation of a Gaussian mixture model approach for classifying ROA by their genetic map lengths indicates the presence of five ROA classes in The 1000 Genomes Project Phase 3 populations, a higher number than was used in our earlier study of the Human Genome Diversity Panel (HGDP) and International HapMap Project (Phase 3) populations, which used a microarray-derived dataset and classified ROA based upon their physical map lengths [18]


Summary

Introduction

Genomic regions of autozygosity (ROA) arise when an individual is homozygous for haplotypes inherited identical-by-descent from ancestors shared by both parents. Genomic autozygosity levels have been reported to influence a number of complex traits, including height and weight [100,101,102,103], cognitive ability [103,104,105], blood pressure [106,107,108,109,110,111,112,113], and cholesterol levels [113], as well as risk for complex diseases such as cancer [86, 87, 114,115,116,117,118], coronary heart disease [86, 119,120,121], amyotrophic lateral sclerosis (ALS) [122], and mental disorders [123, 124]. These observations are consistent with the view that variants with individually small effect sizes associated with complex traits and diseases are more likely to be rare than to be common [125,126,127,128], are more likely to be distributed abundantly rather than sparsely across the genome [9, 129], and are more likely to be recessive than to be dominant [9, 130]. Just as ROA sharing among affected individuals has facilitated our understanding of the genetic basis of monogenic disorders [138] in both inbred [139,140,141,142] and more outbred [143,144,145] families, it represents a potentially powerful approach with which to further our understanding of the genetic etiology of complex disorders [146] of major public health concern worldwide.

