Abstract

The African continent is regarded as the cradle of modern humans and African genomes contain more genetic variation than those from any other continent, yet only a fraction of the genetic diversity among African individuals has been surveyed[1]. Here we performed whole-genome sequencing analyses of 426 individuals, comprising 50 ethnolinguistic groups and including previously unsampled populations, to explore the breadth of genomic diversity across Africa. We uncovered more than 3 million previously undescribed variants, most of which were found among individuals from newly sampled ethnolinguistic groups, as well as 62 previously unreported loci that are under strong selection, which were predominantly found in genes that are involved in viral immunity, DNA repair and metabolism. We observed complex patterns of ancestral admixture and putative-damaging and novel variation, both within and between populations, alongside evidence that Zambia was a likely intermediate site along the routes of expansion of Bantu-speaking populations. Pathogenic variants in genes that are currently characterized as medically relevant were uncommon, but in other genes, variants denoted as ‘likely pathogenic’ in the ClinVar database were commonly observed. Collectively, these findings refine our current understanding of continental migration, identify gene flow and the response to human disease as strong drivers of genome-level population variation, and underscore the scientific imperative for a broader characterization of the genomic diversity of African individuals to understand human ancestry and improve health.

Highlights

  • Advances in genomics have empowered the interrogation of the human genome across global populations[2], with the resulting studies demonstrating that Africa harbours the most genetic variation and diversity[3,4]

  • Our analyses focused on single-nucleotide variants (SNVs) in samples from three African resources: the H3Africa Consortium[8], the Southern African Human Genome Programme (SAHGP)[17] and the Trypanosomiasis Genomics Network of the H3Africa Consortium (TrypanoGEN)[18] (Methods)

  • We deliberately focused on SNVs—which could be confidently inferred—but a similar wealth of diversity and novelty is likely to be found within other variant classes

Summary

Introduction

Advances in genomics have empowered the interrogation of the human genome across global populations[2], with the resulting studies demonstrating that Africa harbours the most genetic variation and diversity[3,4]. An important mandate of H3Africa is to characterize genetic diversity across Africa to facilitate a framework for genomic research. To this end, we analysed whole-genome sequencing (WGS) data generated in 426 individuals from ongoing H3Africa studies, including 314 high-depth (average depth of coverage, 30×) and 112 medium-depth (average depth of coverage, 10×) whole-genome sequences, encompassing 50 ethnolinguistic groups from 13 countries across Africa (Fig. 1a and Supplementary Methods Table 1). Some of these groups are studied here for the first time, providing a unique overview of the diverse landscape of African genomic variation. Eastern (UBS) and southern African (Botswana (BOT)) Niger-Congo (NC) speakers clustered with previously studied populations from their respective geographical regions.
