KRGDB: the large-scale variant database of 1722 Koreans based on whole genome sequencing.

Kwang Su Jung, Kyung-Won Hong, Hyun Youn Jo, Jongpill Choi, Hyo-Jeong Ban, Seong Beom Cho, Myungguen Chung

doi:10.1093/database/baz146


Open Access

https://doi.org/10.1093/database/baz146


Abstract

Since 2012, the Center for Genome Science of the Korea National Institute of Health (KNIH) has been sequencing the complete genomes of 1722 Korean individuals. As a result, more than 32 million variant sites have been identified, a large proportion of them detected for the first time. In this article, we describe the Korean Reference Genome Database (KRGDB) and its genome browser. The current version of our database contains both single nucleotide and short insertion/deletion variants. The DNA samples were obtained from four different origins and sequenced at different depths (10× coverage of 63 individuals, 20× coverage of 194 individuals, combined 10× and 20× coverage of 135 individuals, 30× coverage of 230 individuals and 30× coverage of 1100 individuals). The major features of the KRGDB are that it provides, for each variant site, the Korean variant frequency, the frequency difference between the Korean and other populations and functional annotation (such as regulatory elements in ENCODE regions and coding variant functions). Additionally, we performed a genome-wide association study (GWAS) between Korean genome variant sites for the 230 individuals sequenced at 30× coverage and three major common diseases (diabetes, hypertension and metabolic syndrome). The association results are displayed on our browser. The KRGDB uses a MySQL database and an Apache Tomcat web server with Java Server Pages (JSP) and is freely available at http://coda.nih.go.kr/coda/KRGDB/index.jsp.

Availability: http://coda.nih.go.kr/coda/KRGDB/index.jsp
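To make the notions of variant frequency and between-population frequency difference concrete, the following is a minimal sketch of how an alternate-allele frequency could be derived from diploid genotype counts and compared across two populations. It is not the KRGDB implementation; the function name and all genotype counts are hypothetical and chosen only for illustration.

```python
def alt_allele_frequency(hom_ref: int, het: int, hom_alt: int) -> float:
    """Alternate-allele frequency from diploid genotype counts.

    Each individual carries two alleles: heterozygotes contribute one
    alternate allele, alternate homozygotes contribute two.
    """
    total_alleles = 2 * (hom_ref + het + hom_alt)
    if total_alleles == 0:
        raise ValueError("no genotyped individuals")
    return (het + 2 * hom_alt) / total_alleles


# Hypothetical genotype counts at one variant site (illustration only).
korean_af = alt_allele_frequency(hom_ref=1500, het=200, hom_alt=22)
other_af = alt_allele_frequency(hom_ref=900, het=90, hom_alt=10)

# Frequency difference between the two populations at this site.
freq_difference = korean_af - other_af
print(f"Korean AF={korean_af:.4f}, other AF={other_af:.4f}, "
      f"difference={freq_difference:+.4f}")
```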

Highlights

Advances in sequencing technology permit rapid nucleotide sequencing of large sections of genomes to be achieved at a lower cost than using classical Sanger sequencing methodology [1]
The Center for Genome Science (CGS) initiated the Korean Reference Genome project (KRG) in 2012 and has been conducting whole genome sequencing on a total of 1722 Korean individuals, wherein more than 32 million variants for the Korean population were identified, and a large proportion of the variants were detected for the first time
The Ansan-Ansung cohort is a subset of the cohorts established by the Korean Genome Epidemiology Study (KoGES); 8842 individuals of the Ansan-Ansung cohort were previously genotyped with the Affymetrix 5.0 SNP array and used in a genome-wide association study (GWAS) [13]

Summary

Introduction

Advances in sequencing technology (next-generation sequencing [NGS]) permit rapid nucleotide sequencing of large sections of genomes at a lower cost than classical Sanger sequencing [1]. The Center for Genome Science (CGS) initiated the Korean Reference Genome (KRG) project in 2012 and has been conducting whole genome sequencing on a total of 1722 Korean individuals; more than 32 million variants were identified in the Korean population, a large proportion of them for the first time. In the first phase (2012–2014), we constructed a database and web browser (the Korean Reference Genome Database [KRGDB]) for 27 million single nucleotide variants (SNVs) and 4.9 million short insertion/deletion variants (indels) from 622 individuals. Also in the first phase, a genome-wide association study (GWAS) was performed between Korean genome variant sites for the 230 individuals sequenced at 30× coverage and three major common diseases (diabetes, hypertension and metabolic syndrome). The KRGDB uses a MySQL database and an Apache Tomcat web server with Java Server Pages (JSP) and is freely available at http://coda.nih.go.kr/coda/KRGDB/index.jsp.
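The sample counts can be cross-checked with a short tally. The sketch below uses the per-depth batch sizes listed in the abstract; treating the first four batches as the first phase is an assumption, made here only because their sum matches the 622 individuals quoted above.

```python
# Batch sizes by sequencing depth, as listed in the abstract.
batches = [
    ("10x", 63),
    ("20x", 194),
    ("10x + 20x combined", 135),
    ("30x", 230),
    ("30x", 1100),
]

# Assumption: the first four batches make up the first phase (2012-2014).
first_phase_total = sum(count for _, count in batches[:4])
overall_total = sum(count for _, count in batches)

print(f"first phase: {first_phase_total} individuals")
print(f"overall:     {overall_total} individuals")
```

Under that assumption, the first four batches sum to 622 and all five to 1722, consistent with the figures reported in the text.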



Journal: Database	Publication Date: Jan 1, 2020
Citations: 47	License type: CC BY 4.0


