Abstract

The UK Biobank (UKB) has recently released genotypes on 152,328 individuals together with extensive phenotypic and lifestyle information. We present a new phasing method, SHAPEIT3, that can handle such biobank-scale datasets and achieves switch error rates as low as ~0.3%. The method exhibits O(N log N) scaling in sample size (N), enabling fast and accurate phasing of even larger cohorts.
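To make the accuracy metric concrete, the following minimal sketch computes a switch error rate under the commonly used definition: among consecutive heterozygous sites, the fraction of adjacent pairs at which the inferred phase flips relative to the true phase. This is an illustration only, not the SHAPEIT3 implementation or its evaluation code; the function name `switch_error_rate` and the toy data are assumptions introduced here.

```python
def switch_error_rate(true_hap, inferred_hap):
    """Fraction of adjacent heterozygous-site pairs with a phase switch.

    Both arguments are lists of (allele_a, allele_b) pairs, one per site,
    with alleles coded 0/1. Hypothetical helper for illustration.
    """
    if len(true_hap) != len(inferred_hap):
        raise ValueError("haplotype pairs must cover the same sites")

    # For each heterozygous site, record whether the inferred first
    # haplotype carries the same allele as the true first haplotype.
    orientations = []
    for (t0, t1), (i0, i1) in zip(true_hap, inferred_hap):
        if t0 == t1:
            continue  # homozygous sites carry no phase information
        if {i0, i1} != {t0, t1}:
            raise ValueError("genotypes differ between truth and inference")
        orientations.append(i0 == t0)

    if len(orientations) < 2:
        return 0.0  # no adjacent heterozygous pair, so no switch is possible

    # A switch occurs wherever the orientation changes between neighbours.
    switches = sum(a != b for a, b in zip(orientations, orientations[1:]))
    return switches / (len(orientations) - 1)


# Toy data (made up): five heterozygous sites and one homozygous site.
truth = [(0, 1), (1, 0), (0, 0), (0, 1), (1, 0), (0, 1)]
inferred = [(0, 1), (1, 0), (0, 0), (1, 0), (0, 1), (0, 1)]
rate = switch_error_rate(truth, inferred)
```

Because only changes in orientation are counted, an inferred pair that is the true pair with its two haplotypes swapped throughout scores zero switches, which matches the fact that haplotype labels are arbitrary.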

Highlights

  • Estimation of haplotypes from genotypes, known as phasing, is a central part of the pipeline of many modern genetic analyses

  • Estimated haplotypes are important for many population genetics analyses[1,2], and form a central part of imputation algorithms that are routinely used in genome-wide association studies (GWAS)[3,4]

  • We have demonstrated that SHAPEIT3 provides a highly accurate and scalable solution to phasing biobank-scale datasets



Introduction

Estimation of haplotypes from genotypes, known as phasing, is a central part of the pipeline of many modern genetic analyses. Estimated haplotypes are important for many population genetics analyses[1,2], and form a central part of imputation algorithms that are routinely used in genome-wide association studies (GWAS)[3,4]. The ability to phase large datasets is especially important in the context of biobanks that comprise hundreds of thousands of genotyped samples. In May 2015 the UK Biobank (UKB) released genotypes from ~152,000 samples, and this will rise to ~500,000 in 2016. Other biobanks have already collected large-scale genetic datasets[5] or are in the process of doing so[6]. The unprecedented scale of these datasets, and the depth of phenotype information, allow researchers studying

