Abstract

Background: Genetic variation associated with human leukocyte antigen (HLA) genes has immunological functions and is associated with autoimmune diseases. To date, large-scale studies involving classical HLA genes have been limited by time-consuming and expensive HLA-typing technologies. To reduce these costs, single-nucleotide polymorphisms (SNPs) have been used to predict HLA-allele types. Although HLA allelic distributions differ among populations, most prediction models of HLA genes are based on Caucasian samples, with few reported studies involving non-Caucasians.

Results: Our sample consisted of 437 Han Chinese with Affymetrix 5.0 and Illumina 550K SNP data, of whom 214 also had Affymetrix 6.0 SNP data. All individuals had HLA typing at 4-digit resolution. Using these data, we built prediction models of HLA genes that are specific to a Han Chinese population. To optimize these models, we analyzed a number of critical parameters, including flanking-region size, genotyping platform, and imputation. Predictive accuracies generally increased with both sample size and SNP density.

Conclusions: SNP data from the HapMap Project are about five times denser than commercially available genotype chip data. Using chips to genotype our samples, however, reduced the accuracy of our HLA predictions by only ~3%, while saving a great deal of time and expense. We demonstrated that classical HLA alleles can be predicted from SNP genotype data with a high level of accuracy (from 80.37% for HLA-B to 95.79% for HLA-DQB1) in a Han Chinese population. This finding offers researchers new opportunities to obtain HLA genotypes by prediction from their existing chip datasets. Since the structure of genetic variation (e.g. SNPs, HLA, linkage disequilibrium) differs between Han Chinese and Caucasians and has a strong impact on building prediction models for HLA genes, our findings emphasize the importance of building ethnic-specific models when analyzing human populations.

Highlights

  • Genetic variation associated with human leukocyte antigen (HLA) genes has immunological functions and is associated with autoimmune diseases

  • All individuals were HLA typed at 4-digit resolution at 6 HLA loci and were used for training and testing the prediction models of HLA genes. To optimize these prediction models for classical HLA class I and class II genes, we addressed the following questions: 1) would there be differences in HLA allele distributions and optimal flanking regions between our Han Chinese data set and the Haplotype Map (HapMap) Caucasian samples, 2) could major histocompatibility complex (MHC) single-nucleotide polymorphism (SNP) data generated by different platforms yield comparably accurate HLA allele predictions, and 3) could imputation of untyped MHC SNPs improve the accuracy and robustness of the model? We provide practical recommendations for ethnic-specific prediction models of HLA genes regarding HLA gene regions, platforms, and imputation

  • We found that prediction models of HLA genes built with imputation typically provided greater prediction accuracy, which underscores the positive effect of using a higher density of SNPs
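
The accuracy figures quoted above compare predicted HLA alleles against laboratory typing. The text here does not define the exact metric, so the following is only a minimal sketch, assuming accuracy means the fraction of alleles at one locus that are predicted correctly at 4-digit resolution, with the two alleles of a genotype treated as unordered. The function name `allele_accuracy`, the sample IDs, and the toy genotypes are all hypothetical and are not taken from the study.

```python
from collections import Counter


def allele_accuracy(typed, predicted):
    """Fraction of alleles predicted correctly at one locus.

    `typed` and `predicted` map a sample ID to a pair of allele names.
    The order of the two alleles within a genotype is ignored.
    """
    matched = 0
    total = 0
    for sample_id, true_pair in typed.items():
        pred_pair = predicted.get(sample_id, ())
        # Multiset intersection counts each correctly predicted allele once.
        overlap = Counter(true_pair) & Counter(pred_pair)
        matched += sum(overlap.values())
        total += len(true_pair)
    return matched / total if total else float("nan")


# Toy data for illustration only (not from the study).
typed = {
    "s1": ("B*46:01", "B*40:01"),
    "s2": ("B*58:01", "B*58:01"),
    "s3": ("B*13:01", "B*15:02"),
}
predicted = {
    "s1": ("B*40:01", "B*46:01"),  # both alleles correct, order swapped
    "s2": ("B*58:01", "B*15:02"),  # one of two alleles correct
    "s3": ("B*13:01", "B*15:02"),  # both alleles correct
}

accuracy = allele_accuracy(typed, predicted)
print(f"{accuracy:.4f}")
```

Counting alleles rather than whole genotypes gives partial credit when only one of an individual's two alleles is predicted correctly; a stricter genotype-level metric would score that individual as a miss.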


Summary

Introduction

Genetic variation associated with human leukocyte antigen (HLA) genes has immunological functions and is associated with autoimmune diseases. Large-scale studies involving classical HLA genes have been limited by time-consuming and expensive HLA-typing technologies. To reduce these costs, single-nucleotide polymorphisms (SNPs) have been used to predict HLA-allele types. With the advent of high-throughput genotyping technologies, it is relatively easy to obtain large-scale, genome-wide SNP data in humans. This allows for more thorough analyses of questions that involve population genetics. Previous comparative studies have shown that immune systems are generally under strong selective pressures, which are likely driven by virus-host interactions [5,6]. Because of these selective pressures, comparisons between ethnic groups reveal linkage disequilibrium and highly variable patterns of allelic distributions for HLA genes [5,7].

