Abstract

The major histocompatibility complex (MHC) on chromosome 6p21 is one of the most single-nucleotide polymorphism (SNP)-dense regions of the human genome and a prime model for studying conserved sequence polymorphisms and the structural diversity of ancestral haplotypes/conserved extended haplotypes. This study follows up on a previous analysis of the MHC class I region, using the same set of 95 MHC haplotype sequences downloaded from a publicly available BioProject database at the National Center for Biotechnology Information. We aimed to identify and characterize the polymorphic human leukocyte antigen (HLA)-class II genes, the MTCO3P1 pseudogene alleles, the indels of transposable elements as haplotypic lineage markers, and SNP-density crossover (XO) loci at haplotype junctions in DNA sequence alignments of different haplotypes across the extended class II region (∼1 Mb), from the telomeric PRRT1 gene in class III to the COL11A2 gene at the centromeric end of class II. We identified 42 haplotypic indels (20 Alu, 7 SVA, 13 LTR or MERs, and 2 indels composed of a mosaic of different transposable elements) linked to particular HLA-class II alleles. Comparative sequence analyses of 136 haplotype pairs revealed 98 unique XO sites between SNP-poor and SNP-rich genomic segments, with considerable haplotype shuffling located in the proximity of putative recombination hotspots. The majority of XO sites occurred across various regions, including in the vicinity of MTCO3P1 between HLA-DQB1 and HLA-DQB3, between HLA-DQB2 and HLA-DOB, between HLA-DOB and TAP2, and between HLA-DOA and HLA-DPA1, where most XOs were within a HERVK22 sequence. We also determined the genomic positions of the PRDM9-recombination suppression sequence motif ATCCATG/CATGGAT and the PRDM9 recombination activation partial binding motif CCTCCCCT/AGGGGAG in the class II region of the human reference genome (NC_000006) relative to published meiotic recombination positions.
Both the recombination and anti-recombination PRDM9 binding motifs were widely distributed throughout the class II genomic regions with 50% or more found within repeat elements; the anti-recombination motifs were found mostly in L1 fragmented repeats. This study shows substantial haplotype shuffling between different polymorphic blocks and confirms the presence of numerous putative ancestral recombination sites across the class II region between various HLA class II genes.
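
As an illustration of how motif positions of this kind can be located, the following is a minimal sketch and not the authors' pipeline. The function names `reverse_complement` and `find_motif_positions` and the toy sequence are hypothetical; a real analysis would read the class II interval of the reference sequence from a FASTA file and convert the 0-based offsets to genomic coordinates.

```python
import re


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string (A, C, G, T only)."""
    complement = str.maketrans("ACGT", "TGCA")
    return seq.translate(complement)[::-1]


def find_motif_positions(sequence, motifs):
    """Return sorted (start, motif) tuples for every occurrence of each motif.

    Positions are 0-based offsets into `sequence`. A lookahead pattern is
    used so that overlapping occurrences are also reported.
    """
    sequence = sequence.upper()
    hits = []
    for motif in set(m.upper() for m in motifs):
        pattern = re.compile("(?=" + re.escape(motif) + ")")
        for match in pattern.finditer(sequence):
            hits.append((match.start(), motif))
    return sorted(hits)


# Toy sequence (hypothetical), containing the suppression motif on both strands.
toy = "GGATCCATGAACATGGATTT"
suppression = "ATCCATG"
hits = find_motif_positions(toy, [suppression, reverse_complement(suppression)])
print(hits)
```

Passing the motif strings explicitly, rather than always deriving the opposite strand, lets each motif pair be searched exactly as it is written in the text.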

Highlights

  • Several historical studies show that statistically inferred haplotype sequences often miss the importance of the conserved polymorphic sequences (CPS) of conserved extended haplotypes (CEH) (Alper et al, 2006) and ancestral haplotypes (AH) (Dawkins et al, 1999) in matching donors and recipients for transplantation and in identifying the haplotypes involved in autoimmune diseases (Dawkins et al, 1983) such as type 1 diabetes (T1D) (Alper and Larsen, 2017)

  • We investigated the occurrence of transposable element (TE) indels and haplotype exchanges in the class I genomic region (Kulski et al, 2021) and broadened our analysis to TE indels and haplotype switching at the junctions between single-nucleotide polymorphism (SNP)-rich and SNP-poor blocks in the class II region, covering 620 kb of genomic sequence from the human leukocyte antigen (HLA)-DRB1 gene to the COL11A2 gene (Figure 1)

  • The HLA-class II alleles for HLA-DRB1, -DRB2, -DRB3, -DRB4, -DRB5, -DQA1, -DQB1, -DPA1, and -DPB1 were determined by Norman et al (2017), but to better assess the structure of the haplotype changes, we included the alleles for HLA-DQA2, -DQB2, -DOB, -DOA, -DPB2, and -DPA3 and the pseudogene MTCO3P1
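
The idea of a junction between SNP-poor and SNP-rich blocks can be sketched as a windowed mismatch count over a pair of aligned haplotype sequences. This is only an illustrative sketch under stated assumptions: the window size, the `poor_max` and `rich_min` thresholds, and the function names are hypothetical and are not taken from the study.

```python
def snp_density_windows(hap_a, hap_b, window=1000):
    """Count base mismatches per non-overlapping window of two aligned sequences.

    Only positions where both sequences carry A, C, G, or T are counted, so
    gap characters and Ns are ignored. Returns a list of (window_start, count).
    """
    if len(hap_a) != len(hap_b):
        raise ValueError("aligned sequences must have the same length")
    bases = set("ACGT")
    counts = []
    for start in range(0, len(hap_a), window):
        pairs = zip(hap_a[start:start + window], hap_b[start:start + window])
        mismatches = sum(1 for x, y in pairs if x != y and x in bases and y in bases)
        counts.append((start, mismatches))
    return counts


def candidate_junctions(counts, poor_max=0, rich_min=3):
    """Return the start of each window where adjacent windows flip between
    SNP-poor (count <= poor_max) and SNP-rich (count >= rich_min)."""
    junctions = []
    for (_, c1), (s2, c2) in zip(counts, counts[1:]):
        poor_to_rich = c1 <= poor_max and c2 >= rich_min
        rich_to_poor = c1 >= rich_min and c2 <= poor_max
        if poor_to_rich or rich_to_poor:
            junctions.append(s2)
    return junctions


# Toy aligned pair (hypothetical): identical except for three adjacent differences.
hap_a = "A" * 20
hap_b = "A" * 10 + "C" * 3 + "A" * 7
counts = snp_density_windows(hap_a, hap_b, window=5)
print(counts)
print(candidate_junctions(counts))
```

Fixed thresholds keep the sketch short; real haplotype comparisons would likely need window sizes and cut-offs tuned to the local SNP density.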

Summary

Introduction

Haplotypes are combinations of alleles at different loci of phased DNA segregating together in multigenerational families (Bodmer et al, 1986; Lloyd et al, 2016; Alper and Larsen, 2017), essentially as DNA sequences that are identical by descent (IBD) via recent shared ancestry (Druet and Farnir, 2011; Browning and Browning, 2012; Thompson, 2013; Zhou et al, 2020b). After the transition into the third millennium and the publication of the analysis of the first human genomic sequence (International Human Genome Sequencing Consortium, 2001; Venter et al, 2001), haplotype studies began to spread in earnest from the continuous analysis of the MHC super locus (Jeffreys et al, 2001; Ahmad et al, 2003; Kauppi et al, 2003; Miretti et al, 2005; Blomhoff et al, 2006) to other regions of the human genome. IBD segmental mapping of recent ancestry between individuals in families and populations, based on sequence similarity, genotypes, and single-nucleotide polymorphism (SNP) profiles, is a newly developed and tested imputation approach used either with or without linkage disequilibrium (LD) analysis for inferred haplotype detection (Browning and Browning, 2012; Thompson, 2013; Zhou et al, 2020b).
