Genotype imputation accuracy with different reference panels in admixed populations.

Guan-Hua Huang, Yi-Chi Tseng

doi:10.1186/1753-6561-8-s1-s64

Abstract

Genome-wide association studies have successfully identified common variants that are associated with complex diseases. However, the majority of genetic variants contributing to disease susceptibility are yet to be discovered. It is now widely believed that multiple rare variants are likely to be associated with complex diseases. Using custom-made chips or next-generation sequencing to uncover the effects of rare variants on disease can be very expensive with current technology. Consequently, many researchers use genotype imputation to predict the genotypes at rare variants that are not directly genotyped in the study sample. One important question in genotype imputation is how to choose a reference panel that will produce high imputation accuracy in a population of interest. Using whole genome sequence data from the Genetic Analysis Workshop 18 data set, this report compares genotype imputation accuracy among reference panels representing different degrees of genetic similarity to a study sample of admixed Mexican Americans. Results show that a reference panel that closely matches the ancestry of the study population can increase imputation accuracy, but it can also result in more missing genotype calls. A larger reference panel can reduce imputation errors and missing genotypes, but the improvement may be limited. We also find that, for the admixed study sample, simply selecting a single best reference panel from among the HapMap African, European, or Asian populations is not appropriate. A composite reference panel combining all available reference data should be used instead.

Highlights

Large-scale genome-wide association studies (GWAS) based on genotyping common variants have identified only a small fraction of the heritable variation of complex diseases.
Discordance and missing rates are calculated based on the 773,165 single-nucleotide polymorphisms (SNPs) that are present in both the 1000 Genomes phase 1 and whole genome sequence (WGS) data, but not in the GWAS data.
The Genetic Analysis Workshop 18 (GAW18) WGS reference panel can have higher missing genotype rates than the 1000 Genomes reference panels for most thresholds. These results may indicate that a reference panel that closely matches the ancestry of the study population can increase imputation accuracy, but that this risks losing diversity and makes it harder to identify haplotype sharing with simple models, thereby resulting in more missing genotype calls.
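The trade-off between discordance and missing rates across thresholds can be illustrated with a minimal Python sketch. It rests on assumptions that are not stated in the text above: each imputed genotype is given as three posterior probabilities (for 0, 1, or 2 copies of the alternate allele), a genotype is called only if its largest posterior reaches the threshold, the missing rate is the fraction of uncalled genotypes, and the discordance rate is the fraction of called genotypes that differ from the sequenced genotype. The function names and the toy data are hypothetical and are not taken from the paper.

```python
from typing import List, Optional, Sequence, Tuple


def call_genotype(posteriors: Sequence[float], threshold: float) -> Optional[int]:
    """Return the most probable genotype (0, 1, or 2), or None if the
    largest posterior probability is below the calling threshold."""
    best = max(range(3), key=lambda g: posteriors[g])
    return best if posteriors[best] >= threshold else None


def imputation_rates(
    posteriors: List[Sequence[float]],
    truth: List[int],
    threshold: float,
) -> Tuple[float, float]:
    """Return (discordance_rate, missing_rate) under the assumed definitions.

    discordance_rate: share of called genotypes that differ from the truth.
    missing_rate: share of genotypes left uncalled at this threshold.
    """
    calls = [call_genotype(p, threshold) for p in posteriors]
    called = [(c, t) for c, t in zip(calls, truth) if c is not None]
    missing_rate = (len(calls) - len(called)) / len(calls)
    # With no called genotypes there is nothing to compare; report 0.0.
    discordance = sum(c != t for c, t in called) / len(called) if called else 0.0
    return discordance, missing_rate


# Toy data (hypothetical): four imputed genotypes and their sequenced values.
posteriors = [
    (0.95, 0.04, 0.01),
    (0.10, 0.85, 0.05),
    (0.40, 0.35, 0.25),
    (0.02, 0.08, 0.90),
]
truth = [0, 2, 0, 2]

for threshold in (0.5, 0.9):
    discordance, missing = imputation_rates(posteriors, truth, threshold)
    print(f"threshold={threshold}: discordance={discordance:.3f}, missing={missing:.3f}")
```

On this toy data, raising the threshold from 0.5 to 0.9 lowers the discordance rate but raises the missing rate, which is the kind of trade-off the highlights describe when comparing reference panels across thresholds.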

Summary

Introduction

Large-scale genome-wide association studies (GWAS) based on genotyping common variants (minor allele frequency [MAF] ≥ 5%) have identified only a small fraction of the heritable variation of complex diseases. Many researchers use genotype imputation to predict the genotypes at rare variants that are not directly genotyped in the study sample [3]. These predicted genotypes can be ... Imputation methods work by combining a reference panel of individuals genotyped at a dense set of single-nucleotide polymorphisms (SNPs) with a study sample genotyped at a subset of these sites [4]. One might include in the reference panel only the individuals who most closely match the ancestry of the study population [7]. This “best match” strategy reduces the computational burden of imputation, but it can yield suboptimal accuracy because it uses only part of the information in diverse reference collections, or in studies with no clear reference matches (e.g., admixed populations) [6]. Several studies [6,8] have compared and discussed various choices of reference panels.

Methods

Results

Conclusion

Journal: BMC Proceedings
Publication Date: Jun 1, 2014
Citations: 22
License type: cc-by
