Abstract

Intrinsically disordered proteins perform a variety of crucial biological functions despite lacking stable tertiary structure under physiological conditions in vitro. State-of-the-art sequence-based predictors of intrinsic disorder achieve per-residue accuracies above 80%. In a genome-wide study we observed a large difference in predicted disorder content between confirmed and putative human proteins, and suspected that this was due to large errors introduced by the gene-finding algorithms used to annotate putative sequences. To test this hypothesis we trained a predictor to discriminate sequences of real proteins from synthetic sequences that mimic the errors of gene-finding algorithms. Its application to putative human protein sequences shows that they contain a substantial fraction of incorrectly assigned regions. These regions are predicted to have higher disorder content than correctly assigned regions. Our finding provides the first evidence that the current practice of predicting disorder content in putative sequences should be reconsidered, as such estimates are biased.

Highlights

  • Intrinsically disordered proteins (IDPs) and regions (IDRs) perform a variety of crucial biological functions despite lacking stable tertiary structure under physiological conditions in vitro

  • The same difference is preserved in the dataset for Homo sapiens (Figure 5): the distribution of disorder content changes for XP sequences

  • This difference was consistent with the divergence in amino acid composition for NP and XP sequences (Figure 3), since several order-promoting amino acids were highly enriched in NP sequences, and several disorder-promoting amino acids were highly enriched in XP sequences

Summary

Introduction

Intrinsically disordered proteins (IDPs) and regions (IDRs) perform a variety of crucial biological functions despite lacking stable tertiary structure under physiological conditions in vitro. IDPs have a low content of N and of the cross-linking C residues, and are significantly depleted in bulky hydrophobic (I, L, and V) and aromatic (W, Y, and F) amino acid residues, which form and stabilize the hydrophobic cores of folded globular proteins. These amino acids have been called order-promoting amino acids.
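To make the notion of order-promoting residues concrete, the following minimal Python sketch computes the fraction of a sequence made up of the eight residues named above (N, C, I, L, V, W, Y, and F). The function name `order_promoting_fraction` and the toy sequences are hypothetical and chosen only for illustration. This is a simple composition statistic, not a disorder predictor and not the method used in this study.

```python
# Residues listed in the text as order-promoting: N, C, the bulky
# hydrophobic I, L, V, and the aromatic W, Y, F.
ORDER_PROMOTING = frozenset("NCILVWYF")


def order_promoting_fraction(sequence: str) -> float:
    """Return the fraction of residues in `sequence` that are order-promoting.

    Hypothetical helper for illustration only. Non-letter characters
    (gaps, whitespace, stop symbols) are ignored.
    """
    residues = [aa for aa in sequence.upper() if aa.isalpha()]
    if not residues:
        raise ValueError("sequence contains no residues")
    hits = sum(aa in ORDER_PROMOTING for aa in residues)
    return hits / len(residues)


if __name__ == "__main__":
    # Toy sequences, not taken from any database.
    for name, seq in [("toy_a", "MKVLAAGIVG"), ("toy_b", "MSEEKPKSGG")]:
        print(name, order_promoting_fraction(seq))
```

A lower fraction is consistent with the depletion described for IDPs, but composition alone does not establish that a sequence is disordered.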
