Abstract

Background: The use of missing genotype imputation and haplotype reconstruction is valuable in genome-wide association studies (GWASs). By modeling the patterns of linkage disequilibrium in a reference panel, genotypes not directly measured in the study samples can be imputed and used for GWASs. Since millions of single nucleotide polymorphisms need to be imputed in a GWAS, faster methods for genotype imputation and haplotype reconstruction are required.

Results: We developed a program package for parallel computation of genotype imputation and haplotype reconstruction. Our program package, ParaHaplo 3.0, is intended for use in workstation clusters using the Intel Message Passing Interface. We compared the performance of ParaHaplo 3.0 on the Japanese in Tokyo, Japan, and Han Chinese in Beijing, China, samples in the HapMap dataset. A parallel version of ParaHaplo 3.0 can conduct genotype imputation 20 times faster than a non-parallel version of ParaHaplo.

Conclusions: ParaHaplo 3.0 is an invaluable tool for conducting haplotype-based GWASs. The need for faster genotype imputation and haplotype reconstruction using parallel computing will become increasingly important as the data sizes of such projects continue to increase. ParaHaplo executable binaries and program sources are available at http://en.sourceforge.jp/projects/parallelgwas/releases/.

Highlights

  • The use of missing genotype imputation and haplotype reconstruction is valuable in genome-wide association studies (GWASs).

  • The results show that the parallel version of ParaHaplo 3.0 performed haplotype estimation 20 times faster than the non-parallel version of ParaHaplo 3.0.

  • ParaHaplo is based on data parallelism. Our results showed that the computation time of each genotype imputation was approximately proportional to the number of single nucleotide polymorphisms (SNPs) within the linkage disequilibrium (LD) block; we believe that a large LD block may create a computational bottleneck, as it does in haplotype estimation [6].
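
The bottleneck described in the last highlight can be illustrated with a small scheduling sketch. It assumes, as the highlight states, that the cost of imputing an LD block is roughly proportional to its SNP count, and it distributes blocks over workers with a greedy largest-first rule. The function names, the block sizes, and the greedy rule itself are illustrative assumptions; this is not ParaHaplo's actual scheduler, and no MPI calls are made.

```python
import heapq


def assign_blocks(block_sizes, n_workers):
    """Assign LD blocks to workers, largest block first, always to the
    currently least-loaded worker (hypothetical scheme, not ParaHaplo's).

    block_sizes[i] is the SNP count of block i, used as a cost proxy.
    Returns one list of block indices per worker.
    """
    if n_workers < 1:
        raise ValueError("n_workers must be at least 1")
    # Min-heap of (current load, worker id).
    heap = [(0, w) for w in range(n_workers)]
    heapq.heapify(heap)
    assignment = [[] for _ in range(n_workers)]
    by_size = sorted(enumerate(block_sizes), key=lambda p: p[1], reverse=True)
    for block_id, size in by_size:
        load, worker = heapq.heappop(heap)
        assignment[worker].append(block_id)
        heapq.heappush(heap, (load + size, worker))
    return assignment


def makespan(block_sizes, assignment):
    """Estimated wall-clock cost: the load of the busiest worker."""
    return max(sum(block_sizes[b] for b in blocks) for blocks in assignment)


# Made-up SNP counts for six LD blocks; block 0 is much larger than the rest.
sizes = [120, 15, 30, 45, 10, 20]
plan = assign_blocks(sizes, 3)
print(plan)                   # blocks handled by each of the 3 workers
print(makespan(sizes, plan))  # cost of the slowest worker
```

With these made-up sizes the total cost is 240, so three perfectly balanced workers would each carry 80. The single 120-SNP block cannot be split under block-level data parallelism, so the slowest worker still carries 120, which is the kind of bottleneck the highlight attributes to large LD blocks.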


Summary

Introduction

The use of missing genotype imputation and haplotype reconstruction is valuable in genome-wide association studies (GWASs). By modeling the patterns of linkage disequilibrium in a reference panel, genotypes not directly measured in the study samples can be imputed and used for GWASs. Since millions of single nucleotide polymorphisms need to be imputed in a GWAS, faster methods for genotype imputation and haplotype reconstruction are required. GWASs compare the frequency of alleles or genotypes of a particular variant between cases and controls for a particular disease across a given genome [2,3,4]. To conduct GWASs quickly, we developed ParaHaplo 3.0, a software package for the parallel computation of genotype imputation and haplotype reconstruction. ParaHaplo 3.0 is intended for use in workstation clusters using the Intel Message Passing Interface (MPI).

