Abstract
Protein-nucleic acid interactions are central to many fundamental biological processes. Automated methods that reliably identify DNA- and RNA-binding residues in protein sequences are therefore increasingly important. Most current algorithms rely on feature-based prediction, but their accuracy still leaves room for improvement. Here we propose a sequence-based hybrid algorithm, SNBRFinder (Sequence-based Nucleic acid-Binding Residue Finder), which merges a feature predictor, SNBRFinderF, with a template predictor, SNBRFinderT. SNBRFinderF was built using a support vector machine whose inputs include the sequence profile and other complementary sequence descriptors, while SNBRFinderT was implemented with a sequence alignment algorithm based on profile hidden Markov models to capture weakly homologous templates of the query sequence. Experimental results show that SNBRFinderF was clearly superior to the commonly used sequence profile-based predictor and that SNBRFinderT achieved performance comparable to that of structure-based template methods. By leveraging the complementary relationship between these two predictors, SNBRFinder reasonably improved the performance of both DNA- and RNA-binding residue predictions. More importantly, the sequence-based hybrid prediction reached competitive performance relative to our previous structure-based counterpart. Our extensive and stringent comparisons show that SNBRFinder has clear advantages over existing sequence-based prediction algorithms. The algorithm is available through an easy-to-use web server that is freely accessible at http://ibi.hzau.edu.cn/SNBRFinder.
Highlights
Protein-nucleic acid interactions are central to various fundamental biological processes, especially those related to replication, transcription, and translation [1, 2]
Focusing on feature-based prediction, we find that the performance on DB96/RB105 was clearly inferior to that on DB216/RB159, suggesting that binding residues in query sequences for which no remote template can be found by hidden Markov model (HMM)-based search may be more difficult to identify with the feature-based method
Inspired by our earlier findings that both DNA- and RNA-binding residue predictions can be remarkably improved by merging structure-based template and feature methods, here we developed a sequence-based hybrid algorithm SNBRFinder for predicting nucleic acid-binding residues
Summary
Protein-nucleic acid interactions are central to various fundamental biological processes, especially those related to replication, transcription, and translation [1, 2]. As the number of experimentally solved protein-nucleic acid complexes increases steadily, a variety of structure-based algorithms have been proposed to predict DNA- or RNA-binding residues. These approaches can roughly be partitioned into three classes: (i) the feature-based class; (ii) the template-based class; (iii) the hybrid class. Not all proteins have suitable templates for identifying their binding residues. In these cases, the feature-based predictor might compensate for the shortcomings of the template-based predictor. Based on this assumption, we recently built two hybrid predictors, DNABind and RBRDetector [21, 22], belonging to the third class, which respectively improved DNA- and RNA-binding residue predictions by leveraging the complementary nature of feature- and template-based approaches. Because the number of solved structures substantially lags behind that of protein sequences, it is more urgent to develop effective and efficient computational tools for annotating nucleic acid-binding residues from protein sequence.
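The idea behind the hybrid class can be illustrated with a minimal sketch. The text here does not specify how the feature- and template-based outputs are merged, so the weighted average below, the fallback to feature scores when no template is found, and the names `hybrid_scores`, `call_binding`, `weight`, and `threshold` are all assumptions made for illustration, not the published SNBRFinder procedure.

```python
from typing import List, Optional, Sequence


def hybrid_scores(
    feature_scores: Sequence[float],
    template_scores: Optional[Sequence[float]],
    weight: float = 0.5,
) -> List[float]:
    """Combine per-residue scores from two predictors (illustrative only).

    feature_scores:  one score per residue from a feature-based predictor.
    template_scores: one score per residue from a template-based predictor,
                     or None when no usable template was found.
    weight:          hypothetical weight given to the template score.
    """
    # Assumed fallback: with no template, rely on the feature predictor alone.
    if template_scores is None:
        return list(feature_scores)
    if len(feature_scores) != len(template_scores):
        raise ValueError("score lists must cover the same residues")
    # Assumed merging rule: a simple weighted average per residue.
    return [
        weight * t + (1.0 - weight) * f
        for f, t in zip(feature_scores, template_scores)
    ]


def call_binding(scores: Sequence[float], threshold: float = 0.5) -> List[bool]:
    """Label a residue as binding when its score reaches the threshold."""
    return [s >= threshold for s in scores]
```

For a query with a template, `call_binding(hybrid_scores(f, t))` would give one boolean per residue; for a query without one, passing `None` as the template scores reduces the sketch to the feature-based prediction, which mirrors the complementary role described above.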