Abstract

Background: Copy number variations (CNVs) account for a substantial proportion of inter-individual genomic variation. However, most genomic variation studies have focused on single-nucleotide variations (SNVs), with limited genome-wide analysis of CNVs in large cohorts, especially in populations that are under-represented in genetic studies, including people of African descent.

Methods: We carried out a genome-wide copy number analysis in > 3400 healthy Bantu Africans from Tanzania. Signal intensity data from high-density (> 2.5 million probes) genotyping arrays were used for CNV calling with three algorithms: PennCNV, DNAcopy, and VanillaICE. Stringent quality metrics and filtering criteria were applied to obtain high-confidence CNVs.

Results: We identified over 400,000 CNVs larger than 1 kilobase (kb), for an average of 120 CNVs (SE = 2.57) per individual. We detected 866 large CNVs (≥ 300 kb), some of which overlapped genomic regions previously associated with multiple congenital anomaly syndromes, including Prader-Willi/Angelman syndrome (Type 1) and 22q11.2 deletion syndrome. Furthermore, several of the common CNVs (frequency ≥ 5%) seen in our cohort overlap genes previously associated with developmental disorders.

Conclusions: These findings may help refine the phenotypic outcomes and penetrance of variations affecting genes and genomic regions previously implicated in diseases. Our study provides one of the largest datasets of CNVs from individuals of African ancestry, enabling improved clinical evaluation and disease association of CNVs observed in research and clinical studies in African populations.

Highlights

  • Copy number variations (CNVs) account for a substantial proportion of inter-individual genomic variation.

  • CNVs in regions associated with disease: To assess which CNVs from our cohort overlap genes associated with developmental disorders, we identified overlap of our common (≥ 5%), low-frequency (≥ 1% to < 5%), and rare (≥ 0.1% to < 1%) CNV blocks with genes catalogued in the Developmental Disorders Genotype–Phenotype Database (DDG2P, [48]; Additional file 5), which is compiled based on known implication in disease etiology.

  • CNV detection and analysis: We identified 448,337 CNVs in the genomes of 3463 Bantu African children (Fig. 1).
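The frequency classes and gene-overlap step mentioned in the highlights can be illustrated with a minimal Python sketch. It uses the thresholds stated above (common ≥ 5%, low frequency ≥ 1% to < 5%, rare ≥ 0.1% to < 1%) and the cohort size of 3463. Everything else is an assumption made for illustration: the `Interval` type, the function names, the "below 0.1%" label, the carrier counts, and the gene names and coordinates are hypothetical, and the sketch does not reproduce the authors' pipeline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """A genomic interval with inclusive coordinates (hypothetical helper type)."""
    chrom: str
    start: int
    end: int
    name: str


def frequency_class(carriers: int, cohort_size: int) -> str:
    """Bin a CNV block by the fraction of the cohort carrying it."""
    freq = carriers / cohort_size
    if freq >= 0.05:
        return "common"
    if freq >= 0.01:
        return "low frequency"
    if freq >= 0.001:
        return "rare"
    return "below 0.1%"  # assumed label; this range is not named in the text


def overlaps(a: Interval, b: Interval) -> bool:
    """True when two intervals on the same chromosome share at least one base."""
    return a.chrom == b.chrom and a.start <= b.end and b.start <= a.end


def genes_overlapping(block: Interval, genes: list[Interval]) -> list[str]:
    """Names of the genes that a CNV block overlaps, in input order."""
    return [g.name for g in genes if overlaps(block, g)]


COHORT_SIZE = 3463  # cohort size reported in the highlights

# Hypothetical gene coordinates, not taken from DDG2P.
GENES = [
    Interval("chr1", 1000, 2000, "GENE_A"),
    Interval("chr1", 5000, 6000, "GENE_B"),
    Interval("chr2", 1000, 2000, "GENE_C"),
]

# Hypothetical CNV block carried by 200 individuals.
BLOCK = Interval("chr1", 1500, 5200, "block_1")

if __name__ == "__main__":
    print(frequency_class(200, COHORT_SIZE), genes_overlapping(BLOCK, GENES))
```

A linear scan over genes is enough for a toy example; a genome-wide analysis would more plausibly use an interval tree or a sorted sweep, which is a design choice outside what the text describes.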


Summary

Introduction

Copy number variations (CNVs) account for a substantial proportion of inter-individual genomic variation. CNVs may play a role in the etiology of common, complex diseases and traits, including diabetes, asthma, HIV susceptibility, cancer, and phenotypes in immune and environmental responses [13,14,15,16,17]. In addition to their role in disease, CNVs account for a high level of variation between healthy individuals, both within and between populations [1,2,3,18,19]. The vast majority of CNV data derive from individuals of European descent residing in Western countries, which might lead to incorrect clinical interpretation of genomic variants [26,27,28]. Resources such as the Genome Aggregation Database (gnomAD) have reported structural variations, including CNVs, in large cohorts of individuals of both European and non-European ancestries [29]. Nevertheless, knowledge of the genomic landscape of CNVs remains incomplete, especially in understudied populations such as Africans.

