Abstract

To understand how next-generation sequencing (NGS) improves both our assessment of genetic variation within populations and our knowledge of HLA molecular evolution, we sequenced and analysed 8 HLA loci in a well-documented population from sub-Saharan Africa (Mandenka). Comparing the results of full-gene NGS-MiSeq sequencing with those obtained by traditional typing techniques or limited sequencing strategies showed that segregating sites located outside exon 2 are crucial to describe not only class I but also class II population diversity. A comprehensive analysis of nucleotide diversity in exons 2, 3, 4 and 5 at the 8 HLA loci revealed remarkable differences among these gene regions: a greater variation concentrated in the antigen recognition sites of class I exons 3 and some class II exons 2, likely associated with their peptide-presentation function; a lower diversity of HLA-C exon 3, possibly related to its role as a KIR ligand; and a peculiar molecular diversity of HLA-A exon 2, revealing demographic signals. Based on full-length HLA sequences, we also propose that the most frequent DRB1 allele in the studied population, DRB1*13:04, emerged from an allelic conversion involving 3 potential alleles as donors and DRB1*11:02:01 as recipient. Finally, our analysis revealed a high occurrence of the DRB1*13:04-DQA1*05:05:01-DQB1*03:19 haplotype, possibly resulting from a selective sweep due to protection against Onchocerca volvulus, a prevalent pathogen in West Africa. This study unveils highly relevant information on the molecular evolution of HLA genes in relation to their immune function, calling for similar analyses in other populations living in contrasting environments.

Highlights

  • Due to the extreme polymorphism of the HLA genomic region[1], the application of next-generation sequencing (NGS) to HLA genes has been challenging in the last decades, requiring careful tests and comparisons among different competing technologies.[2,3,4,5,6,7,8] Thanks to tremendous efforts motivated by the need of tissue-typing laboratories to improve the accuracy and throughput of HLA genotyping of potential donors and patients, new HLA sequencing platforms are currently being implemented in most countries, leading to both a […] (wileyonlinelibrary.com/journal/tan; HLA. 2018;91:36–51)

  • Although stimulating results on the molecular diversity of human populations were obtained previously by inferring sequence genotypes for large sets of population samples, thanks to the molecular information stored in the IMGT-HLA database,[13] such approaches could only use data defined at the second field level of resolution, ignoring the information provided by synonymous substitutions and by regions located outside exons 2 and 3.

  • Our analyses suggest the presence of an extended class II haplotype, DRB1*13:04~DQA1*05:05:01~DQB1*03:19~DPB1*131:01, as all allelic pairs of this putative haplotype are in significant linkage disequilibrium (LD) (Supplementary Information S04).
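
As an illustration of what pairwise linkage disequilibrium between two alleles means, the following is a minimal sketch, not the method used in the study. It assumes phased two-locus haplotypes are already available; the function name `pairwise_ld`, the allele labels `A1`/`A2`/`B1`/`B2` and the toy counts are all hypothetical and carry no data from the Mandenka sample.

```python
def pairwise_ld(haplotypes, allele_a, allele_b):
    """Return (D, D', r^2) for one allele at locus A and one allele at locus B.

    haplotypes: list of (allele_at_locus_A, allele_at_locus_B) tuples,
    assumed to be phased (an assumption of this sketch).
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("need at least one haplotype")

    # Allele and haplotype frequencies estimated by direct counting.
    p_a = sum(h[0] == allele_a for h in haplotypes) / n
    p_b = sum(h[1] == allele_b for h in haplotypes) / n
    p_ab = sum(h == (allele_a, allele_b) for h in haplotypes) / n

    # D: deviation of the observed haplotype frequency from independence.
    d = p_ab - p_a * p_b

    # D': D scaled by its maximum possible magnitude given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = d / d_max if d_max > 0 else 0.0

    # r^2: squared correlation between the two allelic states.
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2


# Hypothetical toy sample of 10 phased haplotypes.
toy_haplotypes = (
    [("A1", "B1")] * 4 + [("A1", "B2")] * 1 + [("A2", "B2")] * 5
)
d, d_prime, r2 = pairwise_ld(toy_haplotypes, "A1", "B1")
print(f"D={d:.3f}  D'={d_prime:.3f}  r2={r2:.3f}")
```

In this toy sample `B1` only ever occurs together with `A1`, so D' reaches its maximum, while r² stays below 1 because `A1` also occurs without `B1`. Testing whether such a deviation is statistically significant requires an additional step that is not shown here.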

Introduction

Due to the extreme polymorphism of the HLA genomic region[1] (www.ebi.ac.uk/ipd/imgt/hla/stats.html), the application of next-generation sequencing (NGS) to HLA genes has been challenging in the last decades, requiring careful tests and comparisons among different competing technologies.[2,3,4,5,6,7,8] Thanks to tremendous efforts motivated by the need of tissue-typing laboratories to improve the accuracy and throughput of HLA genotyping of potential donors and patients, new HLA sequencing platforms are currently being implemented in most countries, leading to both a […]

Besides its potential benefits for histocompatibility, DNA sequencing of the HLA region opens new perspectives in the area of human population genetics by permitting direct analyses of nucleotide variation within a molecular evolutionary genetics framework.[10,11,12] Although stimulating results on the molecular diversity of human populations were obtained previously by inferring sequence genotypes for large sets of population samples, thanks to the molecular information stored in the IMGT-HLA database,[13] such approaches could only use data defined at the second field level of resolution, ignoring the information provided by synonymous substitutions and by regions located outside exons 2 and 3.

Several genetic polymorphisms were analysed, including immunoglobulin markers,[16,17,18] HLA by both serological[16] and Polymerase Chain Reaction-Sequence Specific Oligonucleotide (PCR-SSO)[19] methods, mtDNA,[20] genome-wide Restriction Fragment Length Polymorphisms (RFLPs),[21] alpha-[22] and beta-globins[23] and N-acetyltransferase 2.[24] Based on these different sources of independent information, the Mandenka population (which has for many years been a reference population in the HGDP-CEPH Database, www.cephb.fr/hgdp/main.php) is known to exhibit a very high level of genetic diversity, probably as a result of population expansion,[25] and is considered to be representative of a larger population group of Western Africa.[23]
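
To make "direct analyses of nucleotide variation" concrete, the sketch below computes two standard summary statistics from a set of aligned sequences: the number of segregating sites and the nucleotide diversity (average number of pairwise differences per site). It is a minimal illustration under stated assumptions, not the pipeline used in the study: the function names and the toy sequences are hypothetical, the sequences are assumed to be pre-aligned with no gaps or ambiguity codes, and no sample-size correction is applied.

```python
from itertools import combinations


def segregating_sites(sequences):
    """Count alignment columns in which more than one nucleotide is observed."""
    return sum(len(set(column)) > 1 for column in zip(*sequences))


def nucleotide_diversity(sequences):
    """Average number of pairwise differences per site among aligned sequences."""
    if len(sequences) < 2:
        raise ValueError("need at least two sequences")
    length = len(sequences[0])
    if length == 0 or any(len(s) != length for s in sequences):
        raise ValueError("sequences must be non-empty and of equal length")

    pairs = list(combinations(sequences, 2))
    # Sum the mismatches over every unordered pair of sequences.
    total_differences = sum(
        sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs
    )
    return total_differences / (len(pairs) * length)


# Hypothetical toy alignment (4 sequences, 8 sites); not real HLA data.
toy_alignment = ["ACGTACGT", "ACGTACGA", "ACGAACGA", "ACGTACGT"]
print("segregating sites:", segregating_sites(toy_alignment))
print("nucleotide diversity:", round(nucleotide_diversity(toy_alignment), 4))
```

Because both statistics operate on the nucleotide sequence itself, they register synonymous substitutions and variation in any region included in the alignment, which is precisely the information lost when alleles are only defined at the second field level.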
