Abstract
Few sequence alignment methods have been designed specifically for integral membrane proteins, even though these important proteins have distinct evolutionary and structural properties that might affect their alignments. Existing approaches typically consider membrane-related information either by using membrane-specific substitution matrices or by assigning distinct penalties for gap creation in transmembrane and non-transmembrane regions. Here, we ask whether favoring matching of predicted transmembrane segments within a standard dynamic programming algorithm can improve the accuracy of pairwise membrane protein sequence alignments. We tested various strategies using a specifically designed program called AlignMe. An updated set of homologous membrane protein structures, called HOMEP2, was used as a reference for optimizing the gap penalties. The best of the membrane-protein optimized approaches were then tested on an independent reference set of membrane protein sequence alignments from the BAliBASE collection. When secondary structure (S) matching was combined with evolutionary information (using a position-specific substitution matrix (P)), in an approach we called AlignMePS, the resultant pairwise alignments were typically among the most accurate over a broad range of sequence similarities when compared to available methods. Matching transmembrane predictions (T), in addition to evolutionary information, and secondary-structure predictions, in an approach called AlignMePST, generally reduces the accuracy of the alignments of closely-related proteins in the BAliBASE set relative to AlignMePS, but may be useful in cases of extremely distantly related proteins for which sequence information is less informative. The open source AlignMe code is available at https://sourceforge.net/projects/alignme/, and at http://www.forrestlab.org, along with an online server and the HOMEP2 data set.
Highlights
Integral membrane proteins constitute 25–30% of the genes in a given genome [1,2,3] and play crucial roles in cell biology by allowing cells to interact with their environment; they constitute pharmacological targets for around 50% of active drugs on the market [4,5]
1.1 AlignMe Program Overview. AlignMe is a protein sequence alignment tool written in C++, designed to allow multiple different protein descriptors to be considered simultaneously when defining the similarity between two aligned positions
We first constructed an updated set of homologous membrane protein structures (HOMEP2; see Section 1.3), and structure-based sequence alignments of these proteins were used as a reference
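The idea of combining several descriptors into one per-position similarity inside a standard dynamic programming alignment can be illustrated with a minimal Python sketch. This is not the AlignMe implementation (which is written in C++). The function names, the weights, the identity-based sequence term, the single-letter annotation codes ("H" for helix, "M" for membrane) and the linear gap penalty are all assumptions chosen for illustration only.

```python
from typing import Tuple


def combined_score(i: int, j: int,
                   seq_a: str, seq_b: str,
                   ss_a: str, ss_b: str,
                   tm_a: str, tm_b: str,
                   w_seq: float = 1.0, w_ss: float = 0.5,
                   w_tm: float = 0.5) -> float:
    """Similarity of position i in A and j in B (hypothetical weighting)."""
    # Toy stand-in for a substitution matrix or profile score.
    seq_term = 1.0 if seq_a[i] == seq_b[j] else -1.0
    # Reward identical predicted secondary-structure states.
    ss_term = 1.0 if ss_a[i] == ss_b[j] else 0.0
    # Reward pairs where both positions are predicted to be in the membrane.
    tm_term = 1.0 if tm_a[i] == "M" and tm_b[j] == "M" else 0.0
    return w_seq * seq_term + w_ss * ss_term + w_tm * tm_term


def align(seq_a: str, seq_b: str,
          ss_a: str, ss_b: str,
          tm_a: str, tm_b: str,
          gap: float = -2.0) -> Tuple[str, str, float]:
    """Global (Needleman-Wunsch) alignment with a linear gap penalty."""
    n, m = len(seq_a), len(seq_b)
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    # Borders are filled cumulatively so the traceback comparisons are exact.
    for i in range(1, n + 1):
        score[i][0] = score[i - 1][0] + gap
    for j in range(1, m + 1):
        score[0][j] = score[0][j - 1] + gap

    def pair(i: int, j: int) -> float:
        return combined_score(i, j, seq_a, seq_b, ss_a, ss_b, tm_a, tm_b)

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(score[i - 1][j - 1] + pair(i - 1, j - 1),
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal path.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0
                and score[i][j] == score[i - 1][j - 1] + pair(i - 1, j - 1)):
            out_a.append(seq_a[i - 1])
            out_b.append(seq_b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and (j == 0 or score[i][j] == score[i - 1][j] + gap):
            out_a.append(seq_a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(seq_b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]


if __name__ == "__main__":
    # Made-up sequences and per-residue annotations, for demonstration only.
    a, b, total = align("MKLVA", "MKVA", "HHHHC", "HHHC", "MMMMO", "MMMO")
    print(a)
    print(b)
    print(total)
```

Setting `w_ss` or `w_tm` to zero in this sketch switches the corresponding descriptor off, which is loosely analogous to comparing a sequence-plus-secondary-structure scheme against one that also matches transmembrane predictions. A realistic scheme would use a substitution matrix or profile and affine gap penalties rather than the toy terms used here.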
Summary
Integral membrane proteins constitute 25–30% of the genes in a given genome [1,2,3] and play crucial roles in cell biology by allowing cells to interact with their environment; they constitute pharmacological targets for around 50% of active drugs on the market [4,5]. The study of these proteins is of considerable interest. Only ~1000 high-resolution membrane protein structures are so far available in the Protein Data Bank [8], of which ~400 are unique (http://blanco.biomol.uci.edu/mpstruc/). This situation has motivated many researchers to turn to remote-template homology modeling, in which the unknown structure of a target sequence is modeled on a known (template) structure of a distantly-related protein, in order to gain insights into membrane protein function. The membrane-protein-specific multiple-sequence alignment method PRALINETM [9], for example, manages to recapitulate only ~40% of the columns in alignments in the BAliBASE membrane protein reference set 7 [10], suggesting that further improvements are needed.