Abstract

The major histocompatibility complex (MHC), a cell-surface protein mediating immune recognition, plays important roles in the immune response of all higher vertebrates. MHC molecules are highly polymorphic and are grouped into serotypes according to the specificity of the response. It is commonly believed that a protein's sequence determines its three-dimensional structure and function; hence, the sequence also determines the serotype. Residues, however, differ in importance. In this paper, we quantify residue significance using the available serotype information. Knowing the significance of the residues deepens our understanding of MHC molecules and yields a concise representation of them. We propose a linear programming-based approach that identifies significant residue positions and quantifies their significance in MHC II DR molecules. Among all residue positions in MHC II DR molecules, 18 are of particular significance, which is consistent with the literature on MHC binding sites, and succinct pseudo-sequences appear adequate to capture the features of the whole sequence. When the result is used to classify MHC molecules with serotypes assigned by WHO, a prediction performance of 98.4 percent is achieved. The methods have been implemented in Java (http://code.google.com/p/quassi/).
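To make the notion of a pseudo-sequence concrete, the following is a minimal sketch, not the authors' implementation. It assumes that sequences are already aligned, that significant positions are given as 0-based indices, and that each position carries a non-negative weight. The class name `PseudoSequence`, its methods, the weighted mismatch distance, and all positions and weights shown are hypothetical and chosen only for illustration.

```java
// Hypothetical illustration only; names, positions, and weights are assumptions.
final class PseudoSequence {

    // Keeps only the residues at the given 0-based positions of an aligned sequence.
    static String extract(String alignedSequence, int[] positions) {
        StringBuilder sb = new StringBuilder(positions.length);
        for (int p : positions) {
            sb.append(alignedSequence.charAt(p));
        }
        return sb.toString();
    }

    // Sums the weights of the positions at which two pseudo-sequences differ.
    static double weightedDistance(String a, String b, double[] weights) {
        if (a.length() != b.length() || a.length() != weights.length) {
            throw new IllegalArgumentException("lengths must match");
        }
        double distance = 0.0;
        for (int i = 0; i < a.length(); i++) {
            if (a.charAt(i) != b.charAt(i)) {
                distance += weights[i];
            }
        }
        return distance;
    }

    public static void main(String[] args) {
        int[] positions = {1, 3, 4};            // made-up significant positions
        double[] weights = {0.5, 0.25, 0.25};   // made-up significance weights
        String x = extract("ACDEFG", positions);
        String y = extract("ACDQFG", positions);
        System.out.println(x + " vs " + y + ": " + weightedDistance(x, y, weights));
    }
}
```

In a sketch like this, two molecules that differ only at low-weight or unselected positions end up close together, which is one plausible way a short pseudo-sequence could stand in for the full sequence when comparing serotypes.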
