Abstract

Ethnic-specific SNP arrays are becoming more important for increasing the power of genome-wide association studies in diverse populations. In the Tohoku Medical Megabank Project, we have been developing a series of Japonica Arrays (JPA) for genotyping participants based on reference panels constructed from whole-genome sequence data of the Japanese population. Here, we designed a novel version of the SNP array for the Japanese population, called Japonica Array NEO (JPA NEO), comprising a total of 666,883 markers. Among them, 654,246 tag SNPs on the autosomes and the X chromosome were selected from an expanded reference panel of 3,552 Japanese individuals, 3.5KJPNv2, using the pairwise r² measure of linkage disequilibrium. Additionally, 28,298 markers were included for the evaluation of previously identified disease risk markers from the literature and databases, and those present in the Japanese population were extracted using the reference panel. By genotyping 286 Japanese samples, we found that the imputation quality r² and INFO score in the minor allele frequency bin >2.5–5% were >0.9 and >0.8, respectively, and >12 million markers were imputed with an INFO score >0.8. These results indicate that JPA NEO is a promising tool for genotyping the Japanese population with genome-wide coverage, contributing to the development of genetic risk scores.
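As an illustration of what tag SNP selection by pairwise r² involves, the following is a minimal sketch of a generic greedy procedure. It is not the authors' pipeline. The function names, the toy genotypes, and the 0.8 threshold are assumptions made for the example, and r² is approximated here by the squared correlation of unphased allele counts rather than computed from phased haplotypes.

```python
from itertools import combinations


def r_squared(x, y):
    """Squared Pearson correlation between two vectors of allele counts (0/1/2)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # monomorphic marker: correlation is undefined, treat as no LD
    return cov * cov / (var_x * var_y)


def greedy_tag_snps(genotypes, threshold=0.8):
    """Pick tag SNPs so that every SNP has r^2 >= threshold with some tag.

    genotypes: dict mapping a SNP ID to a list of allele counts per sample.
    threshold: hypothetical r^2 cutoff chosen for this sketch.
    """
    snps = list(genotypes)
    # covers[s] holds the SNPs that s can stand in for (always including s itself).
    covers = {s: {s} for s in snps}
    for a, b in combinations(snps, 2):
        if r_squared(genotypes[a], genotypes[b]) >= threshold:
            covers[a].add(b)
            covers[b].add(a)

    untagged = set(snps)
    tags = []
    while untagged:
        # Choose the SNP that captures the most still-untagged SNPs.
        best = max(snps, key=lambda s: len(covers[s] & untagged))
        tags.append(best)
        untagged -= covers[best]
    return tags


# Toy data: rs1, rs2 and rs3 are in perfect LD; rs4 is only weakly correlated.
toy = {
    "rs1": [0, 1, 2, 0, 1, 2],
    "rs2": [0, 1, 2, 0, 1, 2],
    "rs3": [2, 1, 0, 2, 1, 0],
    "rs4": [0, 0, 1, 1, 2, 2],
}
print(greedy_tag_snps(toy))
```

On the toy data, one marker from the perfectly correlated group plus the weakly correlated marker are enough to cover all four SNPs. A real array design would also restrict comparisons to nearby markers and account for probe design constraints, which this sketch ignores.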

Highlights

  • Increasing the power of genome-wide association studies in diverse populations is important for understanding the genetic determinants of disease risks, and large-scale genotype data are collected by genome cohort and biobank projects all over the world

  • Japonica Array NEO is a promising tool for genotyping the Japanese population with genome-wide coverage, contributing to the development of genetic risk scores for this population and further identifying disease risk alleles among individuals of East Asian ancestry

  • Tag single nucleotide polymorphism (SNP) selection for improved genome-wide coverage: in Japonica Array NEO (JPA NEO), our updated version of the Japonica Array, we used the maximum number of markers that fit on a single array of the Axiom 96-array layout, and the total of nearly 670,000 markers was divided into about 650,000 tag SNPs and tens of thousands of disease-related markers


Summary

Introduction

Increasing the power of genome-wide association studies in diverse populations is important for understanding the genetic determinants of disease risk, and genome cohort and biobank projects around the world are collecting large-scale genotype data. Taking advantage of two cohorts, the TMM CommCohort and the TMM BirThree Cohort, we planned the following strategy for genomic analysis: development of a whole-genome reference panel using the TMM CommCohort, large-scale genotyping and genotype imputation of both cohorts, and collection of accurate haplotype information from the TMM BirThree Cohort. Based on this strategy, we first established an allele frequency panel called 1KJPN, which includes the whole-genome sequencing (WGS) data of 1,070 participants [5]. Based on an updated reference panel, we have developed and refined custom single nucleotide polymorphism (SNP) arrays for genotyping all 157,602 participants.

