Abstract

Background: Imputation has become a standard approach in genome-wide association studies (GWAS) to infer untyped markers in silico. Although the feasibility of imputing common variants is well established, we aimed to assess the imputation of rare and ultra-rare variants in an admixed Caribbean Hispanic (CH) population.

Methods: We evaluated imputation accuracy in CH (N = 1,000), focusing on rare (0.1% ≤ minor allele frequency (MAF) ≤ 1%) and ultra-rare (MAF < 0.1%) variants. We used two reference panels, the Haplotype Reference Consortium (HRC; N = 27,165) and the 1000 Genomes Project (1000G phase 3; N = 2,504), together with multiple phasing (SHAPEIT, Eagle2) and imputation (IMPUTE2, MaCH-Admix) algorithms. To assess imputation quality, we reported: (a) high-quality variant counts according to the imputation tools’ internal indexes (e.g., IMPUTE2 “Info” ≥ 80%); (b) a Wilcoxon signed-rank test comparing imputation quality for genotyped variants that were masked and imputed; (c) Cohen’s kappa coefficient to test agreement between imputed and whole-exome sequencing (WES) variants; and (d) imputation of the G206A mutation in PSEN1 (ultra-rare in the general population and more frequent in CH), followed by confirmation genotyping. We also tested ancestry proportions (European, African, and Native American) against WES-imputation mismatches using Poisson regression.

Results: SHAPEIT2 retrieved a higher percentage of imputed high-quality variants than Eagle2 (rare: 51.02% vs. 48.60%; ultra-rare: 0.66% vs. 0.65%; Wilcoxon p-value < 0.001). SHAPEIT-IMPUTE2 performed better with HRC than with 1000G (64.50% vs. 59.17% for high-quality rare variants and 1.69% vs. 0.75% for high-quality ultra-rare variants; Wilcoxon p-value < 0.001). SHAPEIT-IMPUTE2 outperformed MaCH-Admix. Compared to 1000G, HRC imputation retrieved a higher number of high-quality rare and ultra-rare variants, despite showing lower agreement between imputed and WES variants (e.g., rare: 98.86% for HRC vs. 99.02% for 1000G). High kappa (K = 0.99) was observed for both reference panels. Twelve G206A mutation carriers were imputed, and all were validated by confirmation genotyping. African ancestry was associated with higher imputation errors for uncommon and rare variants (p-value < 1e-05).

Conclusion: Reference panels with larger numbers of haplotypes can improve imputation quality for rare and ultra-rare variants in admixed populations such as CH. Ethnic composition is an important predictor of imputation accuracy, with higher African ancestry associated with poorer imputation accuracy.
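The MAF bins and the agreement statistic mentioned above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names `maf_class` and `cohen_kappa`, the 0/1/2 genotype coding, and the toy genotype vectors are all assumptions made for illustration. Only the MAF thresholds come from the abstract, and the abstract gives no cutoff for "uncommon" variants, so anything above 1% is labelled "other" here.

```python
from collections import Counter


def maf_class(maf):
    """Bin a minor allele frequency using the thresholds stated in the abstract."""
    if maf < 0.001:        # MAF < 0.1%
        return "ultra-rare"
    if maf <= 0.01:        # 0.1% <= MAF <= 1%
        return "rare"
    return "other"         # no further cutoffs are given in the abstract


def cohen_kappa(calls_a, calls_b):
    """Cohen's kappa for two equal-length sequences of categorical calls."""
    if len(calls_a) != len(calls_b) or not calls_a:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(calls_a)
    # Observed agreement: fraction of positions where the two calls match.
    observed = sum(a == b for a, b in zip(calls_a, calls_b)) / n
    # Chance agreement: product of the marginal frequencies, summed over categories.
    count_a, count_b = Counter(calls_a), Counter(calls_b)
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if expected == 1.0:    # both sequences are constant and identical
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical genotypes (0/1/2 copies of the minor allele) for ten samples.
imputed = [0, 0, 1, 2, 0, 1, 0, 0, 1, 0]
wes = [0, 0, 1, 2, 0, 0, 0, 0, 1, 0]
kappa = cohen_kappa(imputed, wes)
print(f"kappa = {kappa:.3f}")
```

Because kappa corrects for chance agreement, it is more informative than raw concordance for rare variants, where most samples are homozygous for the major allele and two call sets agree on nearly every sample even when carriers are missed.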

Highlights

  • Genome-wide association studies (GWASs) are a major tool to identify common variants associated with complex diseases

  • We randomly selected 1,000 Caribbean Hispanics from an original genotyped cohort of 3,138 individuals; the genotype data can be downloaded from dbGaP (Study Accession: phs000496.v1.p1). Of these, 719 individuals came from the Estudio Familiar Investigar Genetica de Alzheimer (EFIGA), a study of familial late-onset Alzheimer’s disease (LOAD), and 281 came from the multiethnic longitudinal cohort Washington Heights, Inwood, Columbia Aging Project (WHICAP)

  • We found that SHAPEIT2 performed better than Eagle2 when evaluated by mean R² and the “Info” metric with either reference panel

Introduction

Genome-wide association studies (GWASs) are a major tool for identifying common variants associated with complex diseases. Microarray SNP chips for GWAS are designed primarily to capture common variants, which are often associated with small effect sizes and are mostly located in intronic and intergenic regions. Imputing rare variants has become important for extending genome coverage in GWAS. Imputation infers untyped SNP markers in the discovery population from densely typed SNPs in one or more external reference panels, and these ‘in silico’ markers increase the coverage of association tests in genome-wide analyses. Although the feasibility of imputing common variants is well established, we aimed to assess the imputation of rare and ultra-rare variants in an admixed Caribbean Hispanic (CH) population.
