Abstract
Many computational methods have been developed to predict functionally important sites of a protein. In the era of big data, existing methods need to be improved and made more sophisticated by integrating sequence data with structural data. In this paper, we pursue two aims: improving sequence-based methods and developing a new method that uses both sequence and structural data. To this end, we developed a modified evolutionary trace method in which we define conservative grades, calculated from a given multiple sequence alignment, and a proximate grade, which uses three-dimensional structures to evaluate predicted active sites from the viewpoint of protein-ion, protein-ligand, protein-nucleic acid, and protein-protein interactions. In other words, the proximate grade can also be used to evaluate an amino acid residue. When we applied our method to translation elongation factor Tu/1A proteins, the results showed that the conservative grades are evaluated accurately by the proximate grade. Our approach therefore has two advantages. First, various cocrystal structures can be taken into account in the evaluation. Second, by calculating the fitness between each given conservative grade and the proximate grade, we can select the best conservative grade.
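The idea of scoring a conservative grade against a proximate grade can be illustrated with a minimal sketch. The abstract does not give the actual definitions, so everything here is an assumption made for illustration: proximity is taken as the smallest atom-to-atom distance between a residue and a bound ligand, and "fitness" is taken as the Spearman rank correlation between per-residue conservation and negated distance. The function names `min_distance`, `ranks` and `spearman` and the toy numbers are hypothetical.

```python
import math


def min_distance(residue_atoms, ligand_atoms):
    """Smallest Euclidean distance between any residue atom and any ligand atom."""
    return min(math.dist(a, b) for a in residue_atoms for b in ligand_atoms)


def ranks(values):
    """1-based ranks in ascending order; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            result[order[k]] = mean_rank
        pos = end + 1
    return result


def spearman(x, y):
    """Spearman rank correlation (assumed stand-in for the paper's 'fitness').

    Raises ZeroDivisionError if either input is constant.
    """
    rx, ry = ranks(x), ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# Toy data (hypothetical): one conservation score and one ligand distance per residue.
conservation = [0.9, 0.2, 0.7, 0.1]
distances = [2.0, 9.0, 4.0, 12.0]      # e.g. values returned by min_distance, in angstroms
proximity = [-d for d in distances]    # closer residues get higher proximity
print(spearman(conservation, proximity))
```

Under these assumptions, several candidate conservation scores could each be compared with the same proximity values, and the candidate with the highest correlation would be the one that best agrees with the structural evidence.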
Highlights
Identifying binding sites is important for understanding how a protein works and binds ions or molecules
When a protein functions, it may have a specific site that binds an ion or a molecule
To identify such an important site experimentally, one must prepare a mutant of the protein in which an amino acid residue is replaced by another, and then measure the difference in binding affinity between the mutant and the wild type
Summary
A protein may have a specific site that binds an ion or a molecule. Identifying binding sites is important for investigating how the protein works and binds ions or molecules. We consider a map, that is, a mathematical formula, defined on a multiple sequence alignment (MSA) and aim to construct an exhaustive method. As part of this effort, we propose a method that currently includes several existing methods: the method based on SE or on SE of residue properties, the method based on a sum of pairs with or without weighting, and the iv-ET and rv-ET methods. Protein structures derived from different organisms are not directly comparable with each other. To solve these problems, we consider another map, which measures the proximity of amino acid residues to ions or molecules, and integrate the two maps. Within $M_i$, one subset ($\subseteq M_i$) denotes the set of residues in $M_i$, and $G_i = \{\gamma_{i1}, \gamma_{i2}, \ldots\} \subset M_i$ denotes the set of gaps in $M_i$.
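To make the notion of a map on an MSA concrete, the following minimal sketch scores each alignment column in two ways. It is not the authors' formula. It assumes that SE stands for Shannon entropy, that gaps (written `-`) are simply ignored, and that the unweighted sum of pairs uses identity scoring (1 for identical residues, 0 otherwise) instead of a substitution matrix. The names `columns`, `column_entropy` and `sum_of_pairs_identity` and the toy alignment are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations

GAP = "-"  # assumed gap symbol


def columns(msa):
    """Split an MSA (equal-length strings) into its columns."""
    return ["".join(col) for col in zip(*msa)]


def column_entropy(column):
    """Shannon entropy (bits) of the residues in one column, gaps ignored.

    Returns None for an all-gap column. Lower entropy means higher conservation.
    """
    residues = [c for c in column if c != GAP]
    if not residues:
        return None
    n = len(residues)
    counts = Counter(residues)
    return -sum((k / n) * math.log2(k / n) for k in counts.values())


def sum_of_pairs_identity(column):
    """Fraction of residue pairs in one column that are identical, gaps ignored.

    Returns None when fewer than two residues are present.
    """
    residues = [c for c in column if c != GAP]
    pairs = list(combinations(residues, 2))
    if not pairs:
        return None
    return sum(a == b for a, b in pairs) / len(pairs)


# Toy alignment (hypothetical): four sequences, four columns.
msa = ["MKV-", "MRVA", "MKIA", "MKVG"]
for index, col in enumerate(columns(msa)):
    print(index, col, column_entropy(col), sum_of_pairs_identity(col))
```

A weighted variant would multiply each pair's contribution by sequence weights, and an entropy over residue properties would first map each residue to a property class before counting; both are left out here to keep the sketch short.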
More From: Journal of Data Mining in Genomics & Proteomics