The discovery and high-throughput structure modeling of specific membrane protein families, such as G-protein coupled receptors (GPCRs) and channel proteins, from complete genome sequences are important issues in pharmacogenomics and structure-based drug design. Membrane protein families have been detected among other protein sequences and classified with high accuracy by means of homology search, hidden Markov models, physicochemical profiles, and neural networks [5]. Structure modeling of membrane proteins by (1) comparative modeling of GPCRs using the structural template of bovine rhodopsin and (2) several non-statistical approaches has also been proposed [2, 3, 8, 7]. Transmembrane (TM) regions in membrane proteins play a major role in structure modeling because of the functional insights they provide and their simple architecture, which consists of a membrane-spanning α-helix composed of approximately 20-30 amino acids. In contrast, inter-helical (IH) loop segments that link TM helices, other than the large loop segments that form compact soluble domains, have been omitted from the structure modeling process. However, a couple of recent crystallographic and diffraction studies of membrane proteins, such as potassium and water channel proteins, have revealed that specific medium-length loop segments between TM helices fold back into the membrane and play an important role in membrane protein folding and biological function [1, 6]. Previous methods of TM helix prediction have not considered the contribution of these IH loop groups. Consequently, such a medium-length loop segment is often predicted as a TM helix. This false-positive prediction causes serious problems in TM topology prediction and structure modeling. Here, we analyzed the IH loop segments in proteins with multiple TM helices in order to classify and detect these specific medium-length loop segments from amino acid sequences.
The results show that loop length and the extracellular environment are the dominant factors in the classification of IH loop types.