To improve imputation quality for genome-wide association studies (GWAS) conducted on the Japanese population, we developed and evaluated four Japanese population-specific reference panels. These panels were constructed by augmenting the 1000 Genomes Project (1KG) panel with Japanese whole genome sequencing (WGS) data, with sample sizes ranging from 1K to 7K individuals enrolled through the Biobank Japan (BBJ) project, and sequencing depths ranging from 3× to 30×. Among these panels, an augmented reference panel comprising 7472 WGS samples of mixed depth (1KG+7K) exhibits the greatest improvement in imputation quality relative to the Trans-Omics for Precision Medicine (TOPMed) reference panel. Notably, we observe these improvements primarily for rare variants with a minor allele frequency (MAF) < 5%. To demonstrate the benefits of improved imputation quality in association analyses of complex traits, we conducted GWAS for serum uric acid and total cholesterol levels following imputation up to the 1KG+7K panel. The analysis reveals several loci that reach genome-wide significance (P < 5 × 10⁻⁸) in the 1KG+7K imputation output yet remain undetected when the same sample set is imputed up to the TOPMed reference panel. In summary, the 1KG+7K panel demonstrates significant advantages in the discovery of trait-associated loci, particularly those driven by low-frequency association signals.