Abstract

Background: To uncover molecular functions and networks in cellular systems, it is important to dissect interactions between proteins and RNAs. Many studies have investigated interactions between protein amino acid residues and RNA bases. For interactions between residues in proteins, it is generally accepted that an amino acid residue at an interacting site has coevolved with its partner residue so that the interaction is maintained. Based on this hypothesis, in our previous study on identifying residue-residue contact pairs in interacting proteins, we calculated mutual information (MI) between amino acid residues from multiple sequence alignments of homologous proteins and combined it with a discriminative random field (DRF) approach, a special type of conditional random field (CRF) that has proved useful for extracting distinguishing areas from photographs in image processing. Recently, evolutionary correlation between residues and DNA bases has also been found in certain transcription factors and their DNA-binding sites.

Results: In this paper, we employ two-dimensional CRFs that are more generic than such DRFs to predict interactions between protein amino acid residues and RNA bases. In addition, we introduce labels representing the kinds of amino acids and bases as local features of the CRF. Furthermore, we examine the utility of L1-norm regularization (lasso) for the CRF. To evaluate our method, we use residue-base interactions between several Pfam domains and Rfam entries, conduct cross-validation, and calculate the average AUC (area under the ROC curve) score. The results suggest that our CRF-based method using mutual information and labels with the lasso is useful for further improving performance, especially when the features of the CRF are successfully reduced by the lasso.

Conclusions: We propose simple and generic two-dimensional CRF models using labels and mutual information with the lasso. Using the CRF-based method in combination with the lasso is particularly useful for predicting residue-base contacts in protein-RNA interactions.
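As an illustration of the mutual information used as a coevolution signal, the following minimal sketch computes MI between one protein alignment column and one RNA alignment column. It assumes that the two alignments are paired row by row (one residue and one base per organism), uses plain frequency estimates without pseudocounts or gap handling, and reports MI in bits; the function name and the toy columns are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def mutual_information(col_a, col_b):
    """Mutual information (in bits) between two equally long alignment columns."""
    if len(col_a) != len(col_b):
        raise ValueError("columns must contain the same number of sequences")
    n = len(col_a)
    count_a = Counter(col_a)              # marginal counts for column A
    count_b = Counter(col_b)              # marginal counts for column B
    count_ab = Counter(zip(col_a, col_b))  # joint counts of (a, b) pairs
    mi = 0.0
    for (a, b), n_ab in count_ab.items():
        p_ab = n_ab / n
        p_a = count_a[a] / n
        p_b = count_b[b] / n
        mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi


# Toy columns (hypothetical data): one symbol per aligned sequence.
residue_column = "KKRRKKRR"    # amino acids at one protein position
covarying_bases = "GGAAGGAA"   # bases that change together with the residues
unrelated_bases = "GAGAGAGA"   # bases that vary independently of the residues

print(mutual_information(residue_column, covarying_bases))  # 1.0
print(mutual_information(residue_column, unrelated_bases))  # 0.0
```

A perfectly covarying pair of columns with two equally frequent symbols gives 1 bit, while statistically independent columns give 0, which is why high MI is read as a hint of coevolution between a residue and a base.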

Highlights

  • It is essential to understand the organization and evolution of cellular systems and molecular networks through the analysis of interactions and molecular recognition

  • We provided ordinary mutual information between two positions obtained from multiple alignments as an input to conditional random fields (CRFs) [20]

  • The results suggest that the CRF-based method using mutual information and labels with the lasso is useful
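To show how an L1-norm penalty (lasso) can reduce the number of active features, here is a minimal sketch of one proximal-gradient update with soft-thresholding. This is a generic technique chosen for illustration; the source does not state which optimizer was used for the CRF, and treating `c` as the weight of the L1 penalty is an assumption, as are all function names.

```python
def soft_threshold(w, threshold):
    """Shrink w toward zero by threshold; values within the threshold become 0."""
    if w > threshold:
        return w - threshold
    if w < -threshold:
        return w + threshold
    return 0.0


def l1_penalized_objective(neg_log_likelihood, weights, c):
    """Objective assumed here: negative log-likelihood plus c * L1 norm of weights."""
    return neg_log_likelihood + c * sum(abs(w) for w in weights)


def proximal_step(weights, gradient, step_size, c):
    """One proximal-gradient update: gradient step on the loss, then shrinkage."""
    return [
        soft_threshold(w - step_size * g, step_size * c)
        for w, g in zip(weights, gradient)
    ]


# Hypothetical weights; a zero gradient isolates the effect of the penalty.
weights = [0.5, -0.05, 2.0]
updated = proximal_step(weights, [0.0, 0.0, 0.0], step_size=0.1, c=1.0)
active = sum(1 for w in updated if w != 0.0)
print(updated, active)
```

With `c = 0` the update leaves the weights untouched, whereas a positive `c` sets small weights exactly to zero, so the corresponding features drop out of the model.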


Summary

Results

To evaluate the proposed CRF, computational experiments were performed under both contact definitions, 3 Å and 5 Å. Average AUC scores for test pairs were obtained using the contact definition of 5 Å with MI, MIp, labels representing the kinds of amino acids and bases, and the grouping of amino acids, for lasso parameter C = 0, 1, and 2. Parameter reduction by the lasso contributed to a decrease in execution time. Altogether, these results suggest that the CRF-based method using mutual information and labels representing the kinds of amino acids and bases with the lasso is very useful for further improving prediction performance.
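The average AUC used for evaluation can be sketched as follows. This example computes AUC as the fraction of (contact, non-contact) pairs in which the contact receives the higher score, counting ties as one half, and then averages over test folds. The scores, labels, and fold layout are hypothetical and do not reproduce the paper's data or its cross-validation setup.

```python
def auc_score(scores, labels):
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def average_auc(folds):
    """Mean AUC over folds, each given as a (scores, labels) tuple."""
    return sum(auc_score(scores, labels) for scores, labels in folds) / len(folds)


# Hypothetical predictions for two test folds; label 1 marks a residue-base contact.
folds = [
    ([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0]),
    ([0.9, 0.8, 0.2, 0.1], [1, 1, 0, 0]),
]
print(average_auc(folds))  # 0.875
```

The pairwise form is quadratic in the number of candidate pairs, so a rank-based computation would be preferable for large alignments; the simple version is used here only because it makes the definition explicit.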
