Abstract

Genotype imputation is now routinely performed in genomic analysis. Reference panel size, that is, the number of haplotypes in the reference panel, is well established as a major determinant of imputation accuracy. For that reason, substantial efforts have been made worldwide to provide large reference panels, and the Haplotype Reference Consortium (HRC) panel is currently the largest available in the public domain. The imputation performance of HRC, whose samples are predominantly European, has been evaluated mainly in Europeans. We conducted whole-genome genotype imputation on two independent genome-wide genotyping datasets, one with 1000 European samples and the other with 1000 Han Chinese samples. We compared the results obtained using HRC with those obtained using the Phase III 1000 Genomes Project (1000G) reference panel. For the European dataset, using HRC improved imputation quality, especially for rare variants with minor allele frequency (MAF) < 0.1%. In the Han Chinese dataset, however, 1000G performed better in both imputation quality and the number of well-imputed variants. We validated the performance of the 1000G reference panel in a second, independent cohort of Han Chinese (N = 2402). Our study demonstrates the limitations of HRC for Han Chinese populations and strongly suggests the need to build population-specific reference panels.

