Abstract

We created a new library of disordered patterns and disordered residues in the Protein Data Bank (PDB). To build these datasets, we clustered the PDB into groups of chains at different identity levels and marked the disordered residues. We developed a new procedure for finding disordered patterns and used it to create a new version of the library. The library includes three sets of patterns: unique patterns, patterns consisting of two kinds of amino acids, and homo-repeats. Using this database, the user can: (1) find homologues in the entire Protein Data Bank; (2) perform a statistical analysis of disordered residues in protein structures; (3) search for disordered patterns and homo-repeats; (4) search for disordered regions in different chains of the same protein; (5) download the clusters of protein chains at different identity levels and the library of disordered patterns; and (6) view the 3D structure interactively using MView. The new library should help improve the accuracy of predicting which residues in a given region will be structured or unstructured.
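
As a rough illustration of what searching for homo-repeats involves, the sketch below scans a one-letter amino acid sequence for runs of a single residue type. It is not the authors' procedure: the function names and the `min_len` threshold are assumptions made for this example, and the abstract does not state which minimum length the library uses.

```python
from itertools import groupby


def find_homo_repeats(sequence, min_len=6):
    """Return (start, residue, length) for each run of one residue type.

    min_len is a hypothetical threshold chosen for illustration only.
    """
    repeats = []
    position = 0
    for residue, run in groupby(sequence):
        run_length = sum(1 for _ in run)
        if run_length >= min_len:
            repeats.append((position, residue, run_length))
        position += run_length
    return repeats


def count_residue_kinds(pattern):
    """Number of distinct amino acid types in a pattern (1 for a homo-repeat)."""
    return len(set(pattern))


if __name__ == "__main__":
    # Toy sequence, not taken from the PDB.
    for start, residue, length in find_homo_repeats("MAAAAAAGHHHHHHHK"):
        print(start, residue, length)
```

The same residue count separates the other set named in the abstract: a pattern built from two kinds of amino acids gives `count_residue_kinds(pattern) == 2`.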

Highlights

  • Disordered proteins and regions are very important for many eukaryotic cell processes [1,2,3,4,5,6]

  • To obtain correct statistics of disordered residues and to create a library of disordered residues and patterns, the Protein Data Bank (PDB) must first be clustered, which is a necessary procedure for processing big data

  • We examined all protein structures in the PDB that were determined by X-ray diffraction analysis with a resolution better than 3 Å and had a protein size of at least 40 amino acid residues; 150,912 PDB entries contained 277,583 protein chains
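
The selection criteria in the last highlight can be sketched as a simple filter over chain records. This is a minimal example under stated assumptions: the `ChainRecord` type, its field names, and the method string are hypothetical, and whether the 3 Å cutoff is strict is not stated in the text, so a strict comparison is assumed here.

```python
from dataclasses import dataclass

MAX_RESOLUTION = 3.0  # Å; strict cutoff is an assumption
MIN_LENGTH = 40       # minimum number of amino acid residues


@dataclass(frozen=True)
class ChainRecord:
    """Hypothetical per-chain record; the field names are illustrative."""
    pdb_id: str
    chain_id: str
    method: str
    resolution: float  # in Å, lower values mean better resolution
    length: int        # number of amino acid residues


def passes_filter(record):
    """True if the chain meets the X-ray, resolution, and size criteria."""
    return (
        record.method.upper() == "X-RAY DIFFRACTION"
        and record.resolution < MAX_RESOLUTION
        and record.length >= MIN_LENGTH
    )


def select_chains(records):
    """Keep only the chains that pass all three criteria."""
    return [record for record in records if passes_filter(record)]
```

Note that "better than 3 Å" corresponds to a numerically smaller resolution value, which is why the comparison is `<`.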

Introduction

Disordered proteins and regions are very important for many eukaryotic cell processes [1,2,3,4,5,6], and interest in intrinsically disordered regions has continued to grow. RNA-binding proteins with prion-like domains, such as FUS and TDP-43, and other proteins with large intrinsically disordered domains are involved in processes such as liquid-gel phase transition [7,8,9,10]. Virus shell proteins have disordered regions that may be important for antiviral vaccine development [11,12,13]. It has been demonstrated that the functions of intrinsically disordered regions are both length- and position-dependent [15]. The functional importance of many disordered regions and patterns remains unclear. To obtain correct statistics of disordered residues and to create a library of disordered residues and patterns, the Protein Data Bank (PDB) must first be clustered, which is a necessary procedure for processing big data.
