Abstract

Recent genetic studies and whole-genome sequencing projects have greatly improved our understanding of human variation and clinically actionable genetic information. Smaller ethnic populations, however, remain underrepresented in both individual and large-scale sequencing efforts and hence present an opportunity to discover new variants of biomedical and demographic significance. This report describes the sequencing and analysis of a genome obtained from an individual of Serbian origin, introducing tens of thousands of previously unknown variants to the currently available pool. Ancestry analysis places this individual in close proximity to Central and Eastern European populations; i.e., closest to Croatian, Bulgarian and Hungarian individuals and, in terms of other Europeans, furthest from Ashkenazi Jewish, Spanish, Sicilian and Baltic individuals. Our analysis confirmed gene flow between Neanderthal and ancestral pan-European populations, with similar contributions to the Serbian genome as those observed in other European groups. Finally, to assess the burden of potentially disease-causing/clinically relevant variation in the sequenced genome, we utilized manually curated genotype-phenotype association databases and variant-effect predictors. We identified several variants that have previously been associated with severe early-onset disease that is not evident in the proband, as well as putatively impactful variants that could yet prove to be clinically relevant to the proband over the next decades. The presence of numerous private and low-frequency variants, along with the observed and predicted disease-causing mutations in this genome, exemplifies some of the global challenges of genome interpretation, especially in the context of under-studied ethnic groups.

Highlights

  • The genetic variation between individuals accounts for much of observed human diversity and has the potential to provide information on phenotypic outcomes of clinical consequence

  • The influence of read mappers was markedly lower; i.e., using the GATK variant caller, we found that 95.1% of single nucleotide variants (SNVs) and 89.3% of indels identified with BWA were identified with Bowtie2, and 98.3% of SNVs and 97.6% of indels identified with Bowtie2 were identified with BWA

  • This work describes the first whole-genome sequencing of a Serbian individual
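The mapper comparison in the highlights is directional: the share of BWA-based calls recovered with Bowtie2 is not the same number as the share of Bowtie2-based calls recovered with BWA, because each is divided by a different call-set size. The following is a minimal sketch of that arithmetic, not the authors' pipeline. The `Variant` tuple, the `percent_shared` helper and the toy call sets are all hypothetical, and matching variants on exact chromosome, position, reference and alternate allele is an assumption made for illustration.

```python
from typing import NamedTuple, Set


class Variant(NamedTuple):
    """Hypothetical variant key: two calls match only if all fields are equal."""
    chrom: str
    pos: int
    ref: str
    alt: str


def percent_shared(calls_a: Set[Variant], calls_b: Set[Variant]) -> float:
    """Return the percentage of variants in calls_a that also occur in calls_b."""
    if not calls_a:
        raise ValueError("calls_a must contain at least one variant")
    return 100.0 * len(calls_a & calls_b) / len(calls_a)


# Toy call sets (invented data, not taken from the study).
shared = {
    Variant("chr1", 1000, "A", "G"),
    Variant("chr1", 2500, "C", "T"),
    Variant("chr2", 300, "G", "A"),
}
bwa_calls = shared | {Variant("chr3", 42, "T", "C")}
bowtie2_calls = shared | {
    Variant("chr4", 77, "A", "T"),
    Variant("chr5", 900, "G", "C"),
}

# The two directions use different denominators, so they generally differ.
bwa_in_bowtie2 = percent_shared(bwa_calls, bowtie2_calls)
bowtie2_in_bwa = percent_shared(bowtie2_calls, bwa_calls)
print(f"BWA calls also found with Bowtie2: {bwa_in_bowtie2:.1f}%")
print(f"Bowtie2 calls also found with BWA: {bowtie2_in_bwa:.1f}%")
```

In practice, SNVs and indels would be compared separately, as in the reported figures, and indels usually need normalization (for example left-alignment) before exact matching is meaningful.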


Summary

Introduction

The genetic variation between individuals accounts for much of observed human diversity and has the potential to provide information on phenotypic outcomes of clinical consequence. Studies of genetic variation provided by individual genome sequences have revealed that this. Characterization of the genetic variation of individuals from multiple populations has revealed a correlation between genetic and geographic distances, and has become relevant for determining genetic ancestry and geographic origin [2,3,4,5,6]. Sequencing of the first human genomes revealed that, although most genetic variation is derived from single nucleotide variants (SNVs), insertions and deletions (indels) account for the majority of the variant nucleotides [11]. Individual genomes from American [11, 12], Han Chinese [13], Russian [14], Khoisan [15], Bantu [15], Japanese [16], German [17], Gujarati Indian [18], Estonian [19], Pakistani [20] and Mongolian [21] populations have been sequenced and analyzed, among many others [1].

