Inadequacy of Evolutionary Profiles Vis-a-vis Single Sequences in Predicting Transient DNA-Binding Sites in Proteins

Ajay Arya, Dana Mary Varghese, Ajay Kumar Verma, Shandar Ahmad

doi:10.1016/j.jmb.2022.167640

Abstract

Sequence-based prediction of DNA-binding residues in a protein is a widely studied problem for which machine learning methods with continuously improving predictive power have been developed. Concatenated rows within a sliding window of a Position Specific Substitution Matrix (PSSM) of the protein are currently used as the primary feature set in almost all methods for predicting DNA-binding residues. Here we report that these evolutionary profiles are powerful only for identifying conserved binding sites and fall short for residue positions that undergo binding-to-non-binding transitions in closely related proteins. We created a database of highly similar protein pairs with known protein-DNA complexes and investigated the differential predictability of conserved and transient binding residues within each pair. Retraining machine learning models uniformly, we compared the predictive power of models trained on PSSMs against that of similarly trained models on sparse-encoded single sequences. We found that, under controlled experiments, single-sequence-based models outperform evolutionary profiles in predicting transient binding sites by as much as 8 percentage points. Thus, we conclude that PSSM-based models are inadequate for predicting high-specificity DNA-binding residues. These findings are of critical significance for the design of mutant- and species-specific DNA ligands and for homology-based modeling of protein-DNA complexes.
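
The two feature sets compared in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the window half-width, the zero-padding at the termini, the function names, and the toy PSSM values are all assumptions made for illustration. The sketch shows only that the same sliding-window concatenation can be applied either to PSSM rows or to sparse (one-hot) rows built from the single sequence.

```python
from typing import List, Sequence

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def one_hot_rows(sequence: str) -> List[List[float]]:
    """Sparse-encode a sequence: one 20-dimensional indicator row per residue."""
    rows = []
    for residue in sequence:
        row = [0.0] * len(AMINO_ACIDS)
        idx = AMINO_ACIDS.find(residue)
        if idx >= 0:  # unknown residues (e.g. 'X') stay all-zero
            row[idx] = 1.0
        rows.append(row)
    return rows


def window_features(rows: Sequence[Sequence[float]], half_width: int) -> List[List[float]]:
    """Concatenate the rows inside a sliding window centred on each position.

    Window positions that fall off either end of the protein are zero-padded
    (an assumption), so every residue gets a vector of length
    (2 * half_width + 1) * row_length.
    """
    if not rows:
        return []
    pad = [0.0] * len(rows[0])
    features = []
    for centre in range(len(rows)):
        vec: List[float] = []
        for pos in range(centre - half_width, centre + half_width + 1):
            vec.extend(rows[pos] if 0 <= pos < len(rows) else pad)
        features.append(vec)
    return features


# Toy input: a 5-residue peptide and a made-up PSSM of the same length.
# Real PSSM rows would come from a profile-building tool, not from this formula.
seq = "MKRAL"
toy_pssm = [[float((i + j) % 5 - 2) for j in range(20)] for i in range(len(seq))]

single_seq_features = window_features(one_hot_rows(seq), half_width=1)
pssm_features = window_features(toy_pssm, half_width=1)
```

Either feature matrix has one row per residue and could be passed to the same classifier, which is what makes a like-for-like comparison of the two encodings possible.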

Journal: Journal of Molecular Biology
Publication Date: May 18, 2022
Citations: 2
