Abstract
Background: Protein-DNA interactions are essential for fundamental biological activities including DNA transcription, replication, packaging, repair and rearrangement. Proteins that interact with DNA can be classified by binding mechanism into two categories: sequence-specific and non-specific binding. Sequence-specific binding provides a mechanism for recognizing the correct nucleotide base pairs. Non-specific binding is a sequence-independent interaction with the DNA backbone that accelerates targeting. Both sequence-specific and non-specific binding residues contribute to the interaction.
Results: The proposed framework has two prediction stages: DNA-binding residue prediction and binding mode prediction. In the first stage, the predictor for sequence-specific binding residues achieves 96.45% accuracy with 50.14% sensitivity, 99.31% specificity, 81.70% precision, and 62.15% F-measure. The predictor for non-specific binding residues achieves 89.14% accuracy with 53.06% sensitivity, 95.25% specificity, 65.47% precision, and 58.62% F-measure. When the prediction results for sequence-specific and non-specific binding residues are combined with an OR operation, the predictor achieves 89.26% accuracy with 56.86% sensitivity, 95.63% specificity, 71.92% precision, and 63.51% F-measure. In the second stage, protein-DNA binding mode prediction achieves 75.83% accuracy using a multi-class support vector machine.
Conclusion: This article presents the design of a sequence-based predictor that identifies sequence-specific and non-specific binding residues in a transcription factor, taking the DNA binding mechanism into account. Protein-DNA binding mode prediction was introduced to help improve DNA-binding residue prediction. In addition, the results of this study will help with the design of binding-mechanism-aware predictors for other families of proteins that interact with DNA.
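The reported measures are related in a way that can be checked directly. The following is a minimal sketch, assuming the standard confusion-matrix definitions and assuming F-measure is the harmonic mean of precision and sensitivity; the abstract does not state the formulas, and the function names are hypothetical. It also shows one plausible reading of the OR combination, in which a residue is called DNA-binding if either predictor flags it.

```python
def f_measure(precision, sensitivity):
    """Harmonic mean of precision and sensitivity (assumed definition)."""
    return 2 * precision * sensitivity / (precision + sensitivity)


def confusion_metrics(tp, fp, tn, fn):
    """Standard per-residue measures from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f_measure": f_measure(precision, sensitivity),
    }


def combine_or(specific_calls, nonspecific_calls):
    """Per-residue OR of two boolean prediction lists (assumed interpretation)."""
    return [a or b for a, b in zip(specific_calls, nonspecific_calls)]


# (precision %, sensitivity %, reported F-measure %) taken from the abstract.
reported = {
    "sequence-specific": (81.70, 50.14, 62.15),
    "non-specific": (65.47, 53.06, 58.62),
    "combined (OR)": (71.92, 56.86, 63.51),
}

for name, (precision, sensitivity, reported_f) in reported.items():
    recomputed = f_measure(precision, sensitivity)
    print(f"{name}: recomputed {recomputed:.2f}, reported {reported_f:.2f}")
```

Under these assumptions, the recomputed F-measures agree with the reported values to within rounding of the published precision and sensitivity figures.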
Highlights
Protein-DNA interactions are essential for fundamental biological activities including DNA transcription, replication, packaging, repair and rearrangement
The independent testing data set used in each run was derived from 30 transcription factor (TF) chains randomly selected from the 253 TF-DNA complexes that we collected
To obtain experimental results that accurately reflect the performance that users of the proposed approach would observe, we removed all training data generated from TF chains homologous to the protein chain under test, where homologous means a sequence identity higher than 20%
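The hold-out procedure described in the last two highlights can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the function names are hypothetical, the text does not say how sequence identity was computed, and `toy_identity` is a deliberately crude position-wise stand-in for a real alignment-based identity.

```python
import random


def pick_test_chains(chain_ids, n_test=30, seed=None):
    """Randomly select n_test chains to hold out for independent testing."""
    rng = random.Random(seed)
    return rng.sample(list(chain_ids), n_test)


def training_chains_for(test_id, chain_ids, held_out, identity, threshold=0.20):
    """Chains usable for training when evaluating on test_id.

    Drops every held-out chain and every chain whose sequence identity to
    test_id is higher than the threshold (20% in the text).
    """
    held = set(held_out)
    return [
        c for c in chain_ids
        if c not in held and identity(c, test_id) <= threshold
    ]


# Toy data and a toy identity measure (assumptions for illustration only).
sequences = {"A": "MKRLL", "B": "MKRLV", "C": "GGSTP", "D": "QWENY"}


def toy_identity(a, b):
    """Fraction of matching positions; NOT a real alignment-based identity."""
    sa, sb = sequences[a], sequences[b]
    matches = sum(x == y for x, y in zip(sa, sb))
    return matches / max(len(sa), len(sb))


# Chain B shares 4 of 5 positions with A, so it is excluded when testing on A.
usable = training_chains_for("A", list(sequences), ["A"], toy_identity)
print(usable)
```

In practice the identity function would come from a pairwise alignment tool, and the filter would be applied once per test chain so that each chain is evaluated only against non-homologous training data.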
Summary
Protein-DNA interactions are essential for fundamental biological activities including DNA transcription, replication, packaging, repair and rearrangement. Protein-DNA non-specific binding is a sequence-independent interaction with the DNA backbone that accelerates targeting. Protein-DNA interactions play important roles in the regulation of key biological functions such as DNA transcription, replication, packaging and recombination. TFs regulate cell development, differentiation, and cell growth by binding to a specific DNA site and regulating gene expression [20,21,22]. Because a recent article reported that the tertiary structures of a large number of TFs are mostly disordered [23], sequence-based analysis that identifies the residues in a highly disordered TF that play key roles in interaction with DNA is essential for obtaining a comprehensive picture of how TFs function.