Abstract

The population of the State of Kuwait is composed of three genetic subgroups of inferred Persian, Saudi Arabian tribe and Bedouin ancestry. The Saudi Arabian tribe subgroup traces its origin to the Najd region of Saudi Arabia. By sequencing two whole genomes and thirteen exomes from this subgroup at high coverage (>40X), we identify 4,950,724 Single Nucleotide Polymorphisms (SNPs), 515,802 indels and 39,762 structural variations. Of the identified variants, 10,098 (8.3%) exomic SNPs, 139,923 (2.9%) non-exomic SNPs, 5,256 (54.3%) exomic indels, and 374,959 (74.08%) non-exomic indels are ‘novel’. Up to 8,070 (79.9%) of the reported novel biallelic exomic SNPs are seen at low frequency (minor allele frequency <5%). We observe 5,462 known and 1,004 novel potentially deleterious nonsynonymous SNPs. Allele frequencies of common SNPs from the 15 exomes are significantly correlated with those from genotype data of a larger cohort of 48 individuals (Pearson correlation coefficient, 0.91; p < 2.2 × 10⁻¹⁶). A set of 2,485 SNPs shows significantly different allele frequencies when compared to populations from other continents. Two notable variants with risk alleles at high frequency in this subgroup are: a nonsynonymous deleterious SNP (rs2108622 [19:g.15990431C>T] in the CYP4F2 gene [MIM:*604426]) associated with the warfarin dosage levels [MIM:#122700] required to elicit a normal anticoagulant response; and a 3′ UTR SNP (rs6151429 [22:g.51063477T>C] in the ARSA gene [MIM:*607574]) associated with Metachromatic Leukodystrophy [MIM:#250100]. The Hemoglobin Riyadh variant (first identified in a Saudi Arabian woman) is observed in the exome data. The mitochondrial haplogroup profiles of the 15 individuals are consistent with the haplogroup diversity seen in Saudi Arabian natives, who are believed to have received substantial gene flow from Africa and eastern provenance.
We present the first genome resource imperative for designing future genetic studies in the Saudi Arabian tribe subgroup. The full-length genome sequences and the identified variants are available at ftp://dgr.dasmaninstitute.org and http://dgr.dasmaninstitute.org/DGR/gb.html.
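
The two frequency statistics quoted in the abstract (the minor allele frequency threshold of 5% and the Pearson correlation between allele frequencies in two cohorts) can be illustrated with a minimal sketch. The function names and all allele counts below are hypothetical and are not taken from the study; the only figures borrowed from the text are the cohort sizes (15 exomes, hence 30 alleles per site, and 48 genotyped individuals, hence 96 alleles per site). This is not the authors' pipeline, only one plausible way to compute the quantities.

```python
from math import sqrt


def minor_allele_frequency(alt_count, total_alleles):
    """Minor allele frequency of a biallelic site from its alternate-allele count."""
    if total_alleles <= 0 or not 0 <= alt_count <= total_alleles:
        raise ValueError("counts must satisfy 0 <= alt_count <= total_alleles, total > 0")
    alt_freq = alt_count / total_alleles
    # The minor allele is whichever of the two alleles is rarer.
    return min(alt_freq, 1.0 - alt_freq)


def is_low_frequency(alt_count, total_alleles, threshold=0.05):
    """True when the site's minor allele frequency is below the threshold (5% by default)."""
    return minor_allele_frequency(alt_count, total_alleles) < threshold


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (sqrt(var_x) * sqrt(var_y))


# Hypothetical alternate-allele counts at five shared SNPs (illustration only).
exome_alt_counts = [3, 12, 20, 7, 15]      # out of 30 alleles (15 diploid exomes)
cohort_alt_counts = [11, 40, 60, 25, 50]   # out of 96 alleles (48 diploid individuals)

exome_freqs = [count / 30 for count in exome_alt_counts]
cohort_freqs = [count / 96 for count in cohort_alt_counts]

r = pearson_r(exome_freqs, cohort_freqs)
print(f"Pearson r between the two sets of allele frequencies: {r:.3f}")
```

A site seen once among 30 alleles has a minor allele frequency of about 3.3% and would fall under the low-frequency threshold, whereas a site seen three times (10%) would not. The significance test behind the reported p-value is not reproduced here.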

Highlights

  • Genetic approaches, like Whole Genome Sequencing (WGS), Exome Sequencing and Genome Wide Association (GWA) Studies, have helped identify causal variants associated with various recessive and complex disorders in many populations [1,2,3,4]

  • This study further provides genetic evidence that the HVS1 (mitochondrial hypervariable segment 1) sequences of the 15 samples cluster with those observed in the native Saudi Arabian population [46]; the evidence corroborates our earlier work [42], which found that 81% of the surnames in the Kuwait S group are of Saudi Arabian tribe origin

  • Examination of genetic clusters derived using principal component analysis (PCA) for the Kuwait population (Figure S1) reveals that these samples are located deep in the Saudi Arabian tribe cluster

Introduction

Genetic approaches, like Whole Genome Sequencing (WGS), Exome Sequencing and Genome Wide Association (GWA) Studies, have helped identify causal variants associated with various recessive and complex disorders in many populations [1,2,3,4]. The primary resources required for disease association studies are provided by population-scale projects such as the 1000 Genomes Project [14] and the International HapMap Project [15]. These efforts have enabled the creation of imputation panels [16,17] and detailed catalogues of SNPs, indels and large structural variations [18]. By virtue of considering geographically diverse populations (for instance, the 1000 Genomes Project considers 1,092 genomes sampled across 14 populations from Europe, East Asia, sub-Saharan Africa and the Americas), these efforts have identified rare or population-specific variants in addition to common variants. Apart from whole genome sequencing projects, population-scale exome sequencing projects (such as the NHLBI Grand Opportunity Exome Sequencing Project (ESP) [20], which covers diverse and richly phenotyped populations in the United States of America) and large-scale exome sequencing projects conducted in individual countries (China [21], Tibet [22], Denmark [23], and Qatar [24]) have provided a large catalogue of variants and have often been successful in associating rare variants with diseases.
