Abstract

Since 2012, the Center for Genome Science of the Korea National Institute of Health (KNIH) has been sequencing the complete genomes of 1722 Korean individuals. As a result, more than 32 million variant sites have been identified, a large proportion of them detected for the first time. In this article, we describe the Korean Reference Genome Database (KRGDB) and its genome browser. The current version of the database contains both single nucleotide and short insertion/deletion variants. The DNA samples were obtained from four different origins and sequenced at different depths (10× coverage of 63 individuals, 20× coverage of 194 individuals, combined 10× and 20× coverage of 135 individuals, 30× coverage of 230 individuals and 30× coverage of 1100 individuals). The major features of the KRGDB are that it provides, for each variant site, the Korean genomic variant frequency, the frequency difference between the Korean and other populations, and functional annotation (such as regulatory elements in ENCODE regions and coding variant functions). Additionally, we performed a genome-wide association study (GWAS) between Korean genome variant sites for the 230 individuals sequenced at 30× coverage and three major common diseases (diabetes, hypertension and metabolic syndrome). The association results are displayed in our browser. The KRGDB uses a MySQL database and an Apache-Tomcat web server with Java Server Pages (JSP) and is freely available at http://coda.nih.go.kr/coda/KRGDB/index.jsp.

Highlights

  • Advances in sequencing technology permit rapid nucleotide sequencing of large sections of genomes to be achieved at a lower cost than using classical Sanger sequencing methodology [1]

  • The Center for Genome Science (CGS) initiated the Korean Reference Genome project (KRG) in 2012 and has been conducting whole genome sequencing on a total of 1722 Korean individuals, wherein more than 32 million variants for the Korean population were identified, and a large proportion of the variants were detected for the first time

  • The Ansan-Ansung cohort is a subset of the cohorts established by the Korean Genome Epidemiology Study (KoGES); 8842 individuals from this cohort were previously genotyped with the Affymetrix 5.0 SNP array and used in a genome-wide association study (GWAS) [13]


Summary

Introduction

Advances in sequencing technology (next-generation sequencing [NGS]) permit rapid nucleotide sequencing of large sections of genomes at a lower cost than classical Sanger sequencing [1]. The Center for Genome Science (CGS) initiated the Korean Reference Genome project (KRG) in 2012 and has been conducting whole genome sequencing on a total of 1722 Korean individuals. More than 32 million variants were identified for the Korean population, a large proportion of them detected for the first time. We constructed a database and web browser (the Korean Reference Genome Database [KRGDB]) for the 27 million single nucleotide variants (SNVs) and 4.9 million short insertion/deletion variants (indels) found in the first phase from 622 individuals (2012–2014). In the first phase, we also performed a genome-wide association study (GWAS) between Korean genome variant sites for the 230 individuals sequenced at 30× coverage and three major common diseases (diabetes, hypertension and metabolic syndrome). The KRGDB uses a MySQL database and an Apache-Tomcat web server with Java Server Pages (JSP) and is freely available at http://coda.nih.go.kr/coda/KRGDB/index.jsp.
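
The sample and variant counts quoted in the abstract and introduction can be cross-checked with a short tally. The sketch below is illustrative only: the group sizes and variant counts are taken from the text, but the split of the sequencing groups into a first phase (the first four groups listed) is an assumption inferred from the totals, not something the source states, and all variable names are hypothetical.

```python
# Sequencing groups as listed in the abstract: (coverage label, individuals).
GROUPS = [
    ("10x", 63),
    ("20x", 194),
    ("10x and 20x combined", 135),
    ("30x", 230),
    ("30x", 1100),
]

# Assumption (not stated in the source): the first four groups make up
# the first phase (2012-2014), and the 1100-individual group came later.
FIRST_PHASE_GROUPS = GROUPS[:4]


def total_individuals(groups):
    """Sum the number of individuals across sequencing groups."""
    return sum(count for _, count in groups)


first_phase_total = total_individuals(FIRST_PHASE_GROUPS)
overall_total = total_individuals(GROUPS)

# First-phase variant counts quoted in the introduction, in millions.
SNVS_MILLIONS = 27.0
INDELS_MILLIONS = 4.9
first_phase_variants_millions = SNVS_MILLIONS + INDELS_MILLIONS

print(f"First-phase individuals: {first_phase_total}")
print(f"All individuals: {overall_total}")
print(f"First-phase variants (millions): {first_phase_variants_millions:.1f}")
```

Under that assumption, the four smaller groups add up to the 622 first-phase individuals and all five groups to the 1722 individuals quoted above, while the SNV and indel counts together come to about 32 million variants.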

