Abstract

Background: A central aim for studying runs of homozygosity (ROHs) in genome-wide SNP data is to detect the effects of autozygosity (stretches of the two homologous chromosomes within the same individual that are identical by descent) on phenotypes. However, it is unknown which current ROH detection program, and which set of parameters within a given program, is optimal for differentiating ROHs that are truly autozygous from ROHs that are homozygous at the marker level but vary at unmeasured variants between the markers.

Method: We simulated 120 Mb of sequence data in order to know the true state of autozygosity. We then extracted common variants from this sequence to mimic the properties of SNP platforms and performed ROH analyses using three popular ROH detection programs, PLINK, GERMLINE, and BEAGLE. We varied detection thresholds for each program (e.g., prior probabilities, lengths of ROHs) to understand their effects on detecting known autozygosity.

Results: Within the optimal thresholds for each program, PLINK outperformed GERMLINE and BEAGLE in detecting autozygosity from distant common ancestors. PLINK's sliding window algorithm worked best when using SNP data pruned for linkage disequilibrium (LD).

Conclusion: Our results provide both general and specific recommendations for maximizing autozygosity detection in genome-wide SNP data, and should apply equally well to research on whole-genome autozygosity burden or to research on whether specific autozygous regions are predictive using association mapping methods.
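To make the idea of an ROH at the marker level concrete, the following is a minimal sketch, not the algorithm used by PLINK, GERMLINE, or BEAGLE. It assumes genotypes are coded as minor-allele counts (0, 1, 2, where 1 is heterozygous), ignores missing calls, and tolerates no heterozygotes inside a run. The function name `find_rohs` and the parameters `min_snps` and `min_length_bp` are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Tuple


def find_rohs(
    genotypes: Sequence[int],
    positions: Sequence[int],
    min_snps: int = 3,
    min_length_bp: int = 0,
) -> List[Tuple[int, int, int]]:
    """Return (start_bp, end_bp, n_snps) for each run of consecutive
    homozygous SNPs that meets both thresholds.

    Assumption: genotypes are minor-allele counts, so 0 and 2 are
    homozygous and 1 is heterozygous. Missing data is not handled.
    """
    if len(genotypes) != len(positions):
        raise ValueError("genotypes and positions must have equal length")

    runs: List[Tuple[int, int, int]] = []
    start = None  # index of the first SNP in the current homozygous run

    # A trailing heterozygous sentinel closes a run that reaches the end.
    for i, g in enumerate(list(genotypes) + [1]):
        if g != 1:
            if start is None:
                start = i
            continue
        if start is not None:
            n_snps = i - start
            length_bp = positions[i - 1] - positions[start]
            if n_snps >= min_snps and length_bp >= min_length_bp:
                runs.append((positions[start], positions[i - 1], n_snps))
            start = None
    return runs


if __name__ == "__main__":
    toy_genotypes = [0, 2, 2, 0, 1, 2, 2, 1, 0, 0, 0, 2]
    toy_positions = [100 * (k + 1) for k in range(len(toy_genotypes))]
    print(find_rohs(toy_genotypes, toy_positions, min_snps=3))
```

Raising `min_snps` or `min_length_bp` makes the toy caller more conservative, which is loosely analogous to the length thresholds varied in the study. A run found this way is homozygous only at the measured markers and may still vary at unmeasured variants between them.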

Highlights

  • A central aim for studying runs of homozygosity (ROHs) in genome-wide SNP data is to detect the effects of autozygosity on phenotypes

  • Although we expected GERMLINE to perform worse than PLINK because of its lower resolution of ROH start and end points, we were surprised by the lower performance of BEAGLE; we had expected the incorporation of linkage disequilibrium (LD) information to yield higher accuracy in detecting autozygosity

  • Only one study analyzed data that was pruned for LD [19], which we have found to be an important step for improving the accuracy of detecting autozygous ROHs

Introduction

A central aim for studying runs of homozygosity (ROHs) in genome-wide SNP data is to detect the effects of autozygosity (stretches of the two homologous chromosomes within the same individual that are identical by descent) on phenotypes. Larger-scale studies using genome-wide SNP data have been conducted for complex phenotypes such as Schizophrenia [15], Bipolar Disorder [16], Parkinson’s disease [17], Alzheimer’s disease [18], Colorectal cancer [19], Childhood acute lymphoblastic leukemia [20], and Breast and Prostate cancer [21]. ROH burden had significant effects on some complex phenotypes (Schizophrenia, Alzheimer’s disease), whereas no effects were found on others (Bipolar Disorder, Colorectal cancer, Childhood acute lymphoblastic leukemia, Breast cancer, and Prostate cancer).
