Abstract
Background: The interaction between proteins and nucleic acids plays pivotal roles in various biological processes such as transcription, translation, and gene regulation. Hot spots are a small set of residues that contribute most to the binding affinity of a protein-nucleic acid interaction. Compared with the extensively studied hot spots on protein-protein interfaces, hot spot residues on protein-nucleic acid interfaces remain less well studied, in part because mutagenesis data for protein-nucleic acid interactions are not as abundant as those for protein-protein interactions.
Results: In this study, we built a new computational model, iPNHOT, to predict hot spot residues on protein-nucleic acid interfaces. A training data set was collected from dbAMEPNI, and an independent test set was collected from recent literature. To build the model, we generated 97 sequential and structural features and used a two-step strategy to select the relevant ones. The final model was built with a support vector machine (SVM) using only 7 features. These include two features newly proposed in this study, ∆SASsa1/2 and esp3. In cross validation, our model achieved an F1 score of 0.725 and an AUROC of 0.807 on the subset collected from ProNIT, compared with 0.407 and 0.670 for mCSM-NA, a state-of-the-art model for predicting the thermodynamic effects of protein-nucleic acid interaction. iPNHOT was further evaluated on the independent test set, where it outperformed other methods.
Conclusion: Using data collected from the recently published database dbAMEPNI, we proposed a new model, iPNHOT, to predict hot spots on both protein-DNA and protein-RNA interfaces. The results show that our model outperforms the existing state-of-the-art models. The model is available to users through a web server: http://zhulab.ahu.edu.cn/iPNHOT/.
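The two metrics reported in the abstract, F1 score and AUROC, can be illustrated with a minimal Python sketch. This is not the authors' evaluation code: the labels, scores, 0.5 decision threshold, and function names below are all hypothetical and chosen only to show how each metric is defined for a binary hot spot / non-hot spot classifier.

```python
# Toy illustration of F1 score and AUROC for a binary classifier.
# All data and names here are assumptions for illustration, not iPNHOT outputs.

def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision or recall is zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def auroc(y_true, scores):
    """Probability that a random positive is scored above a random negative.

    Ties count as half a win (the rank-based definition of the area under
    the ROC curve).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Hypothetical labels (1 = hot spot) and classifier scores for six residues.
y_true = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.4, 0.6, 0.2, 0.8, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

print(f"F1 = {f1_score(y_true, y_pred):.3f}")
print(f"AUROC = {auroc(y_true, scores):.3f}")
```

F1 depends on the chosen decision threshold, whereas AUROC is computed from the ranking of scores alone, which is one reason the two are commonly reported together.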
Highlights
The interaction between proteins and nucleic acids plays pivotal roles in various biological processes such as transcription, translation, and gene regulation
The interface hot spot residues provide clues to understand the principles driving the interaction between protein and nucleic acids
We collected a nonredundant training data set of 293 alanine-mutated residues on protein-nucleic acid interfaces from the dbAMEPNI database. Based on this data set, we developed a single knowledge-based method to predict hot spot residues on both protein-DNA and protein-RNA interfaces
Summary
The interaction of proteins with nucleic acids plays pivotal roles in many cellular processes, such as transcription, translation, RNA metabolism, gene regulation, and DNA replication and repair [1, 2]. While hot spots on protein-protein interfaces have been extensively studied by both experimental and computational methods [6, 9-16], hot spots on protein-nucleic acid interfaces have not been investigated as comprehensively. Very few energetic data on residues at protein-nucleic acid interfaces were collected in past decades, which has slowed the development of computational methods.