Abstract

Background

Copy number variation is an important class of genomic variation that has been reported in 75% of the human genome. However, it is underreported in African populations. Copy number variants (CNVs) could have important impacts on disease susceptibility and environmental adaptation. To describe CNVs and their possible impacts in Africans, we sequenced the genomes of 232 individuals from three major African ethno-linguistic groups: (1) Niger-Congo A from Guinea and Côte d’Ivoire, (2) Niger-Congo B from Uganda and the Democratic Republic of Congo and (3) Nilo-Saharans from Uganda. We used GenomeSTRiP and cn.MOPS to identify copy number variant regions (CNVRs).

Results

We detected 7608 CNVRs, of which 2172 contained only deletions, 2384 contained only insertions and 3052 contained both. We detected 224 previously undescribed CNVRs. The majority of novel CNVRs were present at low frequency and were not shared between populations. We tested for evidence of selection associated with CNVs and also for population structure. Signatures of selection identified previously, using SNPs from the same populations, were overrepresented in CNVRs. When CNVs were tagged with SNP haplotypes to identify SNPs that could predict the presence of CNVs, we identified haplotypes tagging 3096 CNVRs; 372 CNVRs had SNPs with evidence of selection (iHS > 3), and 222 CNVRs had both. This overlap was larger than expected (p < 0.0001) and included loci where CNVs have previously been associated with HIV, Rhesus D and preeclampsia. When our data were integrated with 1000 Genomes CNV data, we replicated the 1000 Genomes observation of population stratification by continent but no clustering by population within Africa, despite the inclusion of Nilo-Saharan and Niger-Congo populations in our dataset.

Conclusions

Novel CNVRs in the current study increase the representation of African diversity in the database of genomic variants. Over-representation of CNVRs in SNP signatures of selection and an excess of SNPs that both tag CNVs and are subject to selection show that CNVs may be the actual targets of selection at some loci. However, unlike SNPs, CNVs alone do not resolve African ethno-linguistic groups. The tag haplotypes identified for CNVs may be useful in predicting African CNVs in future studies where only SNP data are available.
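
The reported excess of CNVRs that are both tagged and under selection can be sanity-checked with a simple overlap calculation. The sketch below is not necessarily the test the authors used; it assumes that all 7608 detected CNVRs form the sampling universe and that, under the null hypothesis, the 3096 tagged CNVRs are a random draw from it, which gives a one-sided hypergeometric test. The function names are illustrative.

```python
from fractions import Fraction
from math import comb


def expected_overlap(universe, set_a, set_b):
    """Expected size of the overlap if the two sets were independent."""
    return set_a * set_b / universe


def hypergeom_upper_tail(universe, set_a, set_b, overlap):
    """P(overlap >= observed) when set_a items are drawn at random
    from a universe that contains set_b 'marked' items."""
    denom = comb(universe, set_a)
    upper = min(set_a, set_b)
    numer = sum(
        comb(set_b, k) * comb(universe - set_b, set_a - k)
        for k in range(overlap, upper + 1)
    )
    # Exact integer arithmetic first, converted to float only at the end.
    return float(Fraction(numer, denom))


# Counts quoted in the abstract; the universe size is an assumption.
TOTAL_CNVRS = 7608   # all detected CNVRs (assumed universe)
TAGGED = 3096        # CNVRs tagged by a SNP haplotype
SELECTED = 372       # CNVRs with SNPs showing iHS > 3
BOTH = 222           # CNVRs in both categories

expected = expected_overlap(TOTAL_CNVRS, TAGGED, SELECTED)
p_value = hypergeom_upper_tail(TOTAL_CNVRS, TAGGED, SELECTED, BOTH)
print(f"expected overlap: {expected:.1f}, observed: {BOTH}")
print(f"one-sided hypergeometric p-value: {p_value:.3g}")
```

Under these assumptions the expected overlap is about 151 CNVRs against 222 observed, which is in line with the reported p < 0.0001.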

Highlights

  • Copy number variation is an important class of genomic variation that has been reported in 75% of the human genome

  • We aimed to discover novel copy number variant regions (CNVRs), investigate population differences associated with CNVs and identify SNP haplotypes that tag CNVs and may predict such CNVs in future genome-wide association studies (GWAS)

  • We used about 50 samples per population, except for the Uganda Bantu Basoga (UBB) population, for which we used 33 (Table 1). Fifty samples provide a 95% chance of discovering CNVRs that have a frequency greater than 7%, while 232 samples give a 95% chance of detecting CNVs with a frequency greater than 2%
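
The sample-size figures in the last highlight can be approximated with a simple binomial argument. The source does not state the model behind its numbers, so the following is only a sketch, assuming that each sampled individual independently carries a given CNV with probability equal to its frequency. The function names are illustrative.

```python
def detection_probability(carrier_freq, n_samples):
    """Chance that at least one of n_samples individuals carries the CNV,
    assuming independent sampling with carrier frequency carrier_freq."""
    return 1.0 - (1.0 - carrier_freq) ** n_samples


def min_detectable_frequency(n_samples, power=0.95):
    """Smallest carrier frequency detected with the requested probability."""
    return 1.0 - (1.0 - power) ** (1.0 / n_samples)


for n in (50, 232):
    print(f"n={n}: frequency detected with 95% probability "
          f">= {min_detectable_frequency(n):.4f}")

print(f"P(detect | f=0.07, n=50)  = {detection_probability(0.07, 50):.3f}")
print(f"P(detect | f=0.02, n=232) = {detection_probability(0.02, 232):.3f}")
```

Under this assumption the 95% thresholds come out at roughly 6% for 50 samples and 1.3% for 232 samples, so the quoted 7% and 2% are consistent with it as slightly conservative bounds.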

Introduction

Copy number variation is an important class of genomic variation that has been reported in 75% of the human genome. While most genomic studies focus on single nucleotide variants (SNVs), reports of larger genomic variants such as copy number variants (CNVs) are more limited [2]. Given their size, CNVs cover more bases than SNVs [2] and may have a greater influence on gene expression and structure [3, 4]. The four major ethno-linguistic groups in Africa are the Afro-Asiatic, Nilo-Saharan, Khoisan and Niger-Congo, the last of which consists of two major subdivisions: Niger-Congo-A and Niger-Congo-B [9]. These populations occupy diverse environments, have different cultures and ancestry, and show stratification at the genomic level [9]. Studies of genomic variation such as CNVs in Africans may help explain adaptation, population stratification and disease susceptibility.
