Abstract
The major histocompatibility complex (MHC) is recognised as one of the most important genetic regions in relation to common human disease. Advancement in identification of MHC genes that confer susceptibility to disease requires greater knowledge of sequence variation across the complex. Highly duplicated and polymorphic regions of the human genome such as the MHC are, however, somewhat refractory to some whole-genome analysis methods. To address this issue, we are employing a bacterial artificial chromosome (BAC) cloning strategy to sequence entire MHC haplotypes from consanguineous cell lines as part of the MHC Haplotype Project. Here we present 4.25 Mb of the human haplotype QBL (HLA-A26-B18-Cw5-DR3-DQ2) and compare it with the MHC reference haplotype and with a second haplotype, COX (HLA-A1-B8-Cw7-DR3-DQ2), that shares the same HLA-DRB1, -DQA1, and -DQB1 alleles. We have defined the complete gene, splice variant, and sequence variation contents of all three haplotypes, comprising over 259 annotated loci and over 20,000 single nucleotide polymorphisms (SNPs). Certain coding sequences vary significantly between different haplotypes, making them candidates for functional and disease-association studies. Analysis of the two DR3 haplotypes allowed delineation of the shared sequence between two HLA class II–related haplotypes differing in disease associations and the identification of at least one of the sites that mediated the original recombination event. The levels of variation across the MHC were similar to those seen for other HLA-disparate haplotypes, except for a 158-kb segment that contained the HLA-DRB1, -DQA1, and -DQB1 genes and showed very limited polymorphism compatible with identity-by-descent and relatively recent common ancestry (<3,400 generations). 
These results indicate that the differential disease associations of these two DR3 haplotypes are due to sequence variation outside this central 158-kb segment, and that shuffling of ancestral blocks via recombination is a potential mechanism whereby certain DR–DQ allelic combinations, which presumably have favoured immunological functions, can spread across haplotypes and populations.
Highlights
The classical major histocompatibility complex (MHC) containing the human leukocyte antigen (HLA) loci on human Chromosome 6p21.31 is a gene-dense region spanning nearly 4 Mb
MHC sequences were divided into 10-kb bins, and variations were calculated in each bin
Red and blue plots relate to single nucleotide polymorphism (SNP) and deletion/insertion polymorphism (DIP) variations respectively
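The binning described in the highlights above can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline: it assumes variant positions are already available as 0-based coordinates along the compared sequence, and the names `count_per_bin`, `snp_positions`, and `dip_positions` are hypothetical, with toy data made up for the example.

```python
BIN_SIZE = 10_000  # 10-kb bins, as stated in the text


def count_per_bin(positions, region_length, bin_size=BIN_SIZE):
    """Count how many variant positions fall into each fixed-size bin.

    positions     -- iterable of 0-based variant coordinates (assumed input format)
    region_length -- length of the compared region in bases
    """
    n_bins = -(-region_length // bin_size)  # ceiling division
    counts = [0] * n_bins
    for pos in positions:
        if not 0 <= pos < region_length:
            raise ValueError(f"position {pos} lies outside the region")
        counts[pos // bin_size] += 1
    return counts


# Toy coordinates for a 30-kb region (illustrative only, not real MHC data).
snp_positions = [5, 9_999, 10_000, 25_000, 25_001]
dip_positions = [12_345, 29_999]

snp_counts = count_per_bin(snp_positions, 30_000)  # one series per variant type
dip_counts = count_per_bin(dip_positions, 30_000)
print(snp_counts, dip_counts)
```

Keeping SNPs and DIPs as separate count series mirrors the two plots mentioned above; dividing each count by the bin size would give a per-base variation rate instead of a raw count.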
Summary
The classical major histocompatibility complex (MHC) containing the human leukocyte antigen (HLA) loci on human Chromosome 6p21.31 is a gene-dense region spanning nearly 4 Mb. The allelic and genetic structure of the MHC is complex. It harbours some of the most polymorphic genes in the genome, and sequences differ in size and gene composition partly as a result of non-allelic homologous recombination [1]. These extreme levels of polymorphism and dense genetic organisation, including highly reiterated sequences, have made particular parts of this biologically and medically important region less accessible to genome-wide analyses such as those employed by the
Editor: Derry Roopenian, The Jackson Laboratory, United States of America