Genotypic discrepancies arising from imputation.

Anthony L Hinrichs, Robert C Culverhouse, Brian K Suarez

doi:10.1186/1753-6561-8-s1-s17

Open Access

Journal: BMC Proceedings
Publication Date: Jun 1, 2014
Citations: 11
License type: cc-by

Affiliation: Washington University in St. Louis

Abstract

The ideal genetic analysis of family data would include whole genome sequence data on all family members. A strategy of combining sequence data from a subset of key individuals with inexpensive genome-wide association study (GWAS) chip genotypes on all individuals, in order to infer sequence-level genotypes throughout the families, has been suggested as a highly accurate alternative. This strategy was followed by the Genetic Analysis Workshop 18 data providers. To identify potential consequences of this strategy, we examined the quality of the imputation by comparing GWAS genotype calls with imputed calls for the same variants and counting the discrepancies. Overall, the inference and imputation process worked very well. However, we found that discrepancies occurred at an increased rate when imputation was used to fill in missing data in sequenced individuals. Although this may be an artifact of this particular instantiation of these analytic methods, there may be general genetic or algorithmic reasons to avoid trying to fill in missing sequence data. This is especially true given the risk of false positives and reduced power for family-based transmission tests when founders are incorrectly imputed as heterozygotes. Finally, we note a higher rate of discrepancies when genotypes for unsequenced individuals are inferred using sequenced individuals from other pedigrees drawn from the same admixed population.

Highlights

The ideal genetic analysis of family data would include whole genome sequence data on all family members.
Generation of the data by the Genetic Analysis Workshop 18 (GAW18) providers: We will distinguish between two ways that missing data were “filled in” in the GAW18 data: filling in missing sequence data in the sequenced individuals will be referred to as “imputation,” and inferring sequence-level data for individuals who were only genotyped using a genome-wide association study (GWAS) chip will be referred to as “inference.” We understand the imputation and inference process followed by the GAW18 data providers to consist of the following steps: (a) the GWAS chip data were phased using MaCH [4], and a haplotype scaffolding for the families was created; (b) missing sequence data in the sequenced individuals were imputed using MaCH; (c) sequence haplotypes for the unsequenced individuals were
We first present the results for the high call rate single-nucleotide polymorphisms (SNPs) alone and compare these with the results found in the full comparison SNP set.

Summary

Introduction

The ideal genetic analysis of family data would include whole genome sequence data on all family members. A procedure has been suggested to avoid having to sequence every individual [1]. This procedure uses dense sequence data on a subset of individuals and sparse, inexpensive genome-wide association study (GWAS) chip genotypes on all individuals to infer sequence-level genotypes for the related, unsequenced individuals. The Genetic Analysis Workshop 18 (GAW18) data providers followed this procedure, as documented in [2]. We examine the quality of the imputation to identify potential consequences of this approach.
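The comparison described here, GWAS chip calls against filled-in calls for the same variants, can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `discrepancy_rates`, the 0/1/2 genotype coding, the group labels, and the toy records are all assumptions made for illustration only.

```python
from collections import defaultdict
from typing import Dict, Iterable, Optional, Tuple

# Assumed coding: 0, 1, or 2 copies of the alternate allele; None = missing call.
Genotype = Optional[int]
Record = Tuple[str, Genotype, Genotype]  # (group, chip call, filled-in call)


def discrepancy_rates(records: Iterable[Record]) -> Dict[str, float]:
    """Per-group fraction of comparable genotype pairs that disagree.

    A pair is comparable only when both the chip call and the
    filled-in call are present; other pairs are skipped.
    """
    compared = defaultdict(int)
    discrepant = defaultdict(int)
    for group, chip, filled in records:
        if chip is None or filled is None:
            continue  # nothing to compare for this variant/person
        compared[group] += 1
        if chip != filled:
            discrepant[group] += 1
    return {g: discrepant[g] / compared[g] for g in compared}


# Hypothetical toy data, not GAW18 results.
# "imputed": gaps filled in for sequenced individuals.
# "inferred": sequence-level calls for chip-only individuals.
records = [
    ("imputed", 0, 0),
    ("imputed", 1, 2),     # discrepancy
    ("imputed", 2, 2),
    ("imputed", None, 1),  # skipped: chip call missing
    ("inferred", 1, 1),
    ("inferred", 0, 0),
    ("inferred", 2, 1),    # discrepancy
    ("inferred", 1, 1),
]

rates = discrepancy_rates(records)
for group, rate in sorted(rates.items()):
    print(f"{group}: {rate:.3f}")
```

Keeping the two groups separate reflects the distinction drawn between filling in gaps for sequenced individuals and inferring genotypes for chip-only individuals, so that their discrepancy rates can be compared directly.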

Methods

Results

Conclusion
