Estimation of HLA-A, -B, -DRB1 Haplotype Frequencies Using Mixed Resolution Data from a National Registry with Selective Retyping of Volunteers

Craig Kollman, Martin Maiers, Loren Gragert, Carlheinz Müller, Michelle Setterholm, Machteld Oudshoorn, Carolyn Katovich Hurley

doi:10.1016/j.humimm.2007.10.009

Abstract

Large registries of volunteer hematopoietic stem cell donors typed for HLA contain potentially valuable data for studying haplotype frequencies in the general population. However, the usual assumptions for use of the expectation-maximization (EM) algorithm are typically violated in these registries. To avoid this problem, previous studies using registry data have reduced the HLA typings to low resolution and/or excluded subjects who were selected for testing on behalf of a specific patient ("patient-directed" typings). These restrictions, added to avoid bias from selection of nonrepresentative volunteers for higher-resolution typing, have limited previous results to haplotypes defined at low resolution. In this article, we eliminate the need for such restrictions by formally relaxing the assumptions necessary for the EM algorithm. We show mathematically and through simulation that varying levels of resolution can be incorporated even if the level of typing resolution is chosen based on the HLA type. This allows use of intermediate- and high-resolution data from patient-directed typings to extend haplotype frequency estimates to the allele level for HLA-DRB1. We demonstrate the feasibility of using this computationally demanding algorithm on large datasets by applying it to more than 3 million volunteers listed in the National Marrow Donor Program Registry.
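
For readers unfamiliar with EM-based haplotype frequency estimation, the following is a minimal sketch of the textbook form of the algorithm, not the implementation used in this article. It assumes Hardy-Weinberg equilibrium and represents each subject's typing, at whatever resolution, as the list of unordered haplotype pairs compatible with it. The function name `em_haplotype_frequencies`, the input layout, and the toy haplotype labels are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def em_haplotype_frequencies(subjects, n_iter=200, tol=1e-10):
    """Estimate haplotype frequencies by EM (illustrative sketch only).

    subjects: list with one entry per subject; each entry is a list of
    unordered (hap1, hap2) pairs compatible with that subject's typing.
    Each unordered pair must be listed once. Assumes Hardy-Weinberg
    equilibrium.
    """
    haps = sorted({h for pairs in subjects for pair in pairs for h in pair})
    # Start from uniform frequencies over all haplotypes that appear.
    freq = {h: 1.0 / len(haps) for h in haps}

    for _ in range(n_iter):
        counts = defaultdict(float)
        # E-step: weight each compatible pair by its probability under
        # the current frequencies (factor 2 for heterozygous pairs).
        for pairs in subjects:
            weights = [freq[a] * freq[b] * (1.0 if a == b else 2.0)
                       for a, b in pairs]
            total = sum(weights)
            for (a, b), w in zip(pairs, weights):
                counts[a] += w / total
                counts[b] += w / total
        # M-step: expected haplotype counts divided by 2N chromosomes.
        new_freq = {h: counts[h] / (2 * len(subjects)) for h in haps}
        delta = max(abs(new_freq[h] - freq[h]) for h in haps)
        freq = new_freq
        if delta < tol:
            break
    return freq


# Toy data with hypothetical labels: one unambiguous homozygote and one
# subject whose typing is compatible with two different haplotype pairs.
toy = [
    [("A1-B8", "A1-B8")],
    [("A1-B8", "A2-B7"), ("A1-B7", "A2-B8")],
]
freqs = em_haplotype_frequencies(toy)
for hap, f in sorted(freqs.items()):
    print(f"{hap}: {f:.4f}")
```

In this toy input, the evidence from the homozygous subject shifts the ambiguous subject's posterior weight toward the pair that contains the same haplotype. The sketch does not model the situation addressed in this article, in which the typing resolution itself depends on the HLA type; it only shows the basic E-step and M-step on which such extensions build.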

Full Text