A haplotype is a string of nucleotides or alleles at nearby loci on one chromosome, usually inherited as a unit. Within the major histocompatibility complex (MHC) region on human chromosome 6p, independent population studies of multiple families have identified conserved extended haplotypes (CEHs) that segregate as long stretches (≥1 megabase) of essentially identical DNA sequence at relatively high (≥0.5%) population frequency ("genetic fixity"). CEHs were first identified through segregation analysis in the early 1980s. In European Caucasian populations, the 30 most frequent CEHs account for at least one-third of all MHC haplotypes. These CEHs provide all of the known individual MHC susceptibility and protective genetic markers within those populations for several complex genetic diseases. Haplotypes are rigorously determined either directly, by sequencing single chromosomes, or by Mendelian segregation analysis using families with informative genotypes. The four parental haplotypes are assigned unambiguously using genotypes from the two parents and from two of their children who are haploidentical to each other. However, the most common current technique for phasing haplotypes is probabilistic statistical imputation using unrelated subjects. Such probabilistic techniques have failed to detect CEHs and are thus of questionable value in identifying long-range haplotype structure and, consequently, genetic structure-function relationships. Finally, with haplotypes rigorously defined, association studies can compare the frequencies of alleles among unrelated patient haplotypes with those among haplotypes found only in unaffected family members (i.e., control alleles/haplotypes). Such studies reduce, as much as possible, the confounding effects of population stratification common to all genetic studies.
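The family-based assignment described above can be illustrated with a minimal sketch. It is not the authors' procedure: it is a brute-force enumeration that assumes no recombination, mutation, or typing error within the region, and the function names (`phasings`, `consistent`, `phase_family`) and allele labels (`A1`, `B3`, and so on) are hypothetical. It tries every way of splitting each parent's unphased genotype into two haplotypes and keeps only the splits that can produce every child.

```python
from itertools import product


def phasings(genotype):
    """Return every distinct way to split an unphased genotype into two haplotypes.

    genotype is a list of (allele, allele) pairs, one pair per locus.
    """
    seen = set()
    for flips in product([False, True], repeat=len(genotype)):
        h1 = tuple(b if f else a for (a, b), f in zip(genotype, flips))
        h2 = tuple(a if f else b for (a, b), f in zip(genotype, flips))
        # Sort the pair so that (h1, h2) and (h2, h1) count as one phasing.
        seen.add(tuple(sorted((h1, h2))))
    return sorted(seen)


def consistent(child, pat_pair, mat_pair):
    """True if the child can be built from one paternal and one maternal haplotype."""
    target = [tuple(sorted(g)) for g in child]
    for p in pat_pair:
        for m in mat_pair:
            if [tuple(sorted(pair)) for pair in zip(p, m)] == target:
                return True
    return False


def phase_family(father, mother, children):
    """List every (paternal pair, maternal pair) compatible with all children.

    Assumes no recombination, mutation, or genotyping error (illustrative only).
    """
    return [
        (pat, mat)
        for pat in phasings(father)
        for mat in phasings(mother)
        if all(consistent(c, pat, mat) for c in children)
    ]


# Hypothetical two-locus genotypes; both parents are A1/A2 at the first locus.
father = [("A1", "A2"), ("B1", "B2")]
mother = [("A1", "A2"), ("B3", "B4")]
child1 = [("A1", "A2"), ("B1", "B4")]
child2 = [("A1", "A1"), ("B1", "B3")]  # shares the paternal haplotype with child1

one_child = phase_family(father, mother, [child1])
two_children = phase_family(father, mother, [child1, child2])
print(len(one_child), len(two_children))
```

In this toy family the first child alone leaves the parental phase ambiguous, because both parents and the child are heterozygous for the same two alleles at the first locus. Adding the second child, who is haploidentical to the first, leaves a single assignment of the four parental haplotypes.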