Abstract
Different human leukocyte antigen (HLA) haplotypes (i.e., the specific combinations of HLA-A, -B, and -DR alleles inherited together from one parent) are observed at different frequencies in human populations. Some haplotypes, like HLA-A1-B8, are very frequent, reaching up to 10% in the Caucasian population, while others are very rare. Numerous studies have identified associations between HLA haplotypes and diseases, and differences in haplotype frequencies can in part be explained by these associations: the stronger the association with a severe (autoimmune) disease, the lower the expected HLA haplotype frequency. The peptide repertoires of the HLA molecules composing a haplotype can also influence the frequency of a haplotype. For example, it would seem advantageous to have HLA molecules with non-overlapping binding specificities within a haplotype, as individuals expressing such a haplotype would present a diverse set of peptides from viruses and pathogenic bacteria on the cell surface. To test this hypothesis, we collect the proteome data from a set of common viruses, and estimate the total ligand repertoire of HLA class I haplotypes (HLA-A-B) using in silico predictions. We compare the size of these repertoires to the HLA haplotype frequencies reported in the National Marrow Donor Program (NMDP). We find that most HLA-A and HLA-B pairs have fairly distinct binding motifs, and that the observed haplotypes do not contain HLA-A and -B molecules with more distinct binding motifs than random HLA-A and HLA-B pairs. In addition, the population frequency of a haplotype is not correlated with the distinctness of its HLA-A and HLA-B peptide binding motifs. These results suggest that there is not a strong selection pressure at the haplotype level favoring haplotypes having HLA molecules with distinct binding motifs, which would result in the largest possible presented peptide repertoires in the context of infectious diseases.
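The repertoire comparison described in the abstract can be illustrated with a minimal sketch. It assumes that the in silico predictions have already been reduced to one set of predicted viral ligands per HLA allele. The peptide identifiers and set contents below are invented toy data, the function names are hypothetical, and the Jaccard index is used here only as one possible stand-in for the "distinctness" of two binding motifs; none of this is taken from the study itself.

```python
from itertools import product

# Hypothetical toy data: predicted viral ligands per HLA allele.
# Peptide IDs and set contents are invented for illustration only.
predicted_ligands = {
    "HLA-A*01:01": {"pep1", "pep2", "pep3"},
    "HLA-A*02:01": {"pep4", "pep5"},
    "HLA-B*08:01": {"pep3", "pep6"},
    "HLA-B*07:02": {"pep4", "pep5", "pep7"},
}


def haplotype_repertoire(a_allele, b_allele, ligands):
    """Union of the ligands predicted for the HLA-A and HLA-B molecule."""
    return ligands[a_allele] | ligands[b_allele]


def motif_overlap(a_allele, b_allele, ligands):
    """Jaccard index of the two ligand sets (0 = fully distinct)."""
    union = ligands[a_allele] | ligands[b_allele]
    if not union:
        return 0.0
    shared = ligands[a_allele] & ligands[b_allele]
    return len(shared) / len(union)


def all_pair_overlaps(ligands):
    """Overlap for every possible HLA-A / HLA-B pairing, observed or not."""
    a_alleles = sorted(k for k in ligands if k.startswith("HLA-A"))
    b_alleles = sorted(k for k in ligands if k.startswith("HLA-B"))
    return {
        (a, b): motif_overlap(a, b, ligands)
        for a, b in product(a_alleles, b_alleles)
    }


# Repertoire size and overlap for one example pairing.
size = len(haplotype_repertoire("HLA-A*01:01", "HLA-B*08:01", predicted_ligands))
overlap = motif_overlap("HLA-A*01:01", "HLA-B*08:01", predicted_ligands)
background = all_pair_overlaps(predicted_ligands)
```

Under these assumptions, the overlap of an observed haplotype can be compared with the `background` values for all possible HLA-A and HLA-B pairings, and the repertoire sizes can be set against reported haplotype frequencies.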
Highlights
The human leukocyte antigen (HLA) genes are the most polymorphic coding loci known in humans
It is widely accepted that this variability is maintained by balancing selection, as individuals that are heterozygous in their HLA class I and II loci seem to have a better outcome in infectious diseases (see, e.g., [1] for HIV-1)
Focusing on HLA-A-B haplotypes, the most common haplotypes found in the US population are summarized in Table 1
Summary
The human leukocyte antigen (HLA) genes are the most polymorphic coding loci known in humans. The HLA gene cluster is located in the major histocompatibility complex (MHC) on chromosome 6, and contains over 200 genes. The two groups of loci that contain the MHC class I and II genes dictating T cell responses are the most polymorphic. It is widely accepted that this variability is maintained by balancing selection, as individuals that are heterozygous in their HLA class I and II loci seem to have a better outcome in infectious diseases (see, e.g., [1] for HIV-1). We have argued that the heterozygote advantage is on its own not enough to maintain such a large degree of polymorphism, and that frequency-dependent co-evolution with pathogens should play a major role [4].