Abstract

Despite efforts to interrogate human genome variation through large-scale databases, systematic preference toward populations of Caucasian descent has resulted in unintended reduction of power in studying non-Caucasians. Here we report a compilation of coding variants from 1,055 healthy Korean individuals (KOVA; Korean Variant Archive). The samples were sequenced to a mean depth of 75x, yielding 101 singleton variants per individual. Population genetics analysis demonstrates that the Korean population is a distinct ethnic group comparable to other discrete ethnic groups in Africa and Europe, providing a rationale for such independent genomic datasets. Indeed, KOVA conferred a 22.8% increase in variant filtering power when used in addition to the Exome Aggregation Consortium (ExAC) database on Korean exomes. Functional assessment of nonsynonymous variants supported the presence of purifying selection in Koreans. Analysis of copy number variants detected 5.2 deletions and 10.3 amplifications per individual, with an increased fraction of novel variants among smaller and rarer copy number variable segments. We also report a list of germline variants that are associated with increased tumor susceptibility. This catalog can function as a critical addition to the pre-existing variant databases in pursuing genetic studies of Korean individuals.

Highlights

  • Despite efforts to interrogate human genome variation through large-scale databases, systematic preference toward populations of Caucasian descent has resulted in unintended reduction of power in studying non-Caucasians

  • Among the 944 Korean individuals, we found a total of 14,600 putative copy number variation (CNV) segments

  • We have demonstrated the rationale for constructing an independent Korean genomic database by showing that genetic distances between Korean individuals (KOVA) and other East Asian ethnic groups are comparable to or even greater than those between the ethnic groups of Africa or Europe (Fig. 1g–j)


Summary

Introduction

Despite efforts to interrogate human genome variation through large-scale databases, systematic preference toward populations of Caucasian descent has resulted in unintended reduction of power in studying non-Caucasians. We report a list of germline variants that are associated with increased tumor susceptibility. This catalog can function as a critical addition to the pre-existing variant databases in pursuing genetic studies of Korean individuals. The recent population explosion and a limited purifying selection process during recent human evolutionary history caused an over-accumulation of rare variants of varying functionalities in the human genome, creating limitations in pursuing various disease genetic studies[1, 2]. To circumvent such limitations, large-scale databases containing variants from normal healthy populations have been established to provide a ‘healthy genomic profile’, e.g., the 1000 Genomes Project (1000GP), the Exome Aggregation Consortium (ExAC), and UK10K[3–6]. Elucidating the genetic similarities and differences among the populations in this region will be an invaluable task[15].

