Abstract

Human genome research has advanced through several large international projects, including HapMap, the 1000 Genomes Project (1KGP), the Encyclopedia of DNA Elements (ENCODE), and the International Human Epigenome Consortium (IHEC), and the data generated by these projects serve as reference information for human genome studies. However, more specific reference sets are needed at the population level. Although a few studies have produced Korean reference sets, consisting of a small number of reference genomes and chip-based Korean SNP and CNV databases, no Korean-specific variation information has been constructed at genome scale. Here, we used Korean exomes to construct Korean variation information. Using read data from 100 Korean exomes obtained from the Korea National Institute of Health (KNIH), we mapped each individual's exome data to NCBI GRCh37, merged the mapped information, and extracted SNPs and indels. We identified a pool of 1,907,598 SNPs and 325,166 indels as initial variations, masked the known variations against the dbSNP and 1KGP variation databases, and constructed a database of Korean-specific variations. The database can serve as a pilot database of Korean exome variation and contribute to Korean variation studies using exome chips or whole-genome data.
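
The masking step can be illustrated with a minimal sketch. The abstract does not say which tools or matching rules were used, so the code below assumes that a variant counts as known when its chromosome, position, reference allele, and alternate allele all match an entry in the known set. The `Variant` type, the `mask_known` function, and the toy records are hypothetical and chosen only for illustration.

```python
from typing import Iterable, List, NamedTuple


class Variant(NamedTuple):
    """A single variant call (hypothetical record layout)."""
    chrom: str
    pos: int   # 1-based position on the reference assembly
    ref: str   # reference allele
    alt: str   # alternate allele

    @property
    def kind(self) -> str:
        # Single-base substitutions are SNPs; anything else is treated as an indel here.
        return "SNP" if len(self.ref) == 1 and len(self.alt) == 1 else "indel"


def mask_known(candidates: Iterable[Variant], known: Iterable[Variant]) -> List[Variant]:
    """Return the candidates that are absent from the known set, preserving order."""
    known_set = set(known)
    return [v for v in candidates if v not in known_set]


# Toy data: a pooled call set and a stand-in for the known-variant databases.
pooled = [
    Variant("1", 1000, "A", "G"),
    Variant("1", 2000, "AT", "A"),
    Variant("2", 500, "C", "T"),
]
known = [Variant("1", 1000, "A", "G")]

novel = mask_known(pooled, known)
for v in novel:
    print(v.chrom, v.pos, v.ref, v.alt, v.kind)
```

Exact-match masking is the simplest possible rule. A real pipeline would also need to normalize indel representations before comparison, which this sketch omits.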
