Abstract

Intrinsically disordered proteins (IDPs) are characterized by the lack of a fixed tertiary structure and are involved in the regulation of key biological processes via binding to multiple protein partners. IDPs are malleable, adapting to structurally different partners, and this flexibility stems from features encoded in the primary structure. The assumption that universal sequence information will facilitate coverage of the sparse zones of the human interactome motivated us to explore the possibility of predicting protein-protein interactions (PPIs) that involve IDPs based on sequence characteristics. We developed a method that relies on features of interacting and non-interacting protein pairs and uses machine learning to classify and predict IDP PPIs. Considering both the sequence determinants specific to conformational organization and the multiplicity of IDP interactions in the training phase yielded a reliable approach that is superior to current state-of-the-art methods. By applying a strict evaluation procedure, we confirm that our method predicts interactions of an IDP of interest even on the proteome scale. The method is provided as a web tool to expedite the discovery of new interactions and IDP functions.

Highlights

  • Intrinsically disordered proteins (IDPs) represent a structural class of proteins that lack well-defined tertiary structure in several regions or throughout the entire sequence[6,7,8]

  • The interactome of binary protein-protein interactions (PPIs) that involve at least one IDP from the DisProt database was extracted from the Human Integrated Protein-Protein Interaction rEference (HIPPIE) database[25], a source specialized in human interactions that combines information on PPIs with experimental annotation from ten primary repositories

  • IDPs are distinct due to their compositional bias which influences their binding propensities and selection of partners. This motivated us to develop IDPpi, a method for PPI predictions which utilizes supervised machine learning algorithms and compositional content together with the distribution of features associated with the promotion of structural disorder along the sequence string
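
The idea of describing a protein pair by compositional content plus the distribution of disorder-related features along the sequence can be illustrated with a small sketch. This is not the IDPpi implementation: the function names, the three-segment split, and the choice of A, R, G, Q, S, P, E and K as a commonly cited set of disorder-promoting residues are assumptions made only for illustration.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# Assumption: a commonly cited set of disorder-promoting residues,
# not necessarily the feature set used by IDPpi.
DISORDER_PROMOTING = set("ARGQSPEK")


def composition(seq):
    """Return the fraction of each of the 20 standard amino acids in seq."""
    if not seq:
        raise ValueError("sequence must not be empty")
    counts = Counter(seq)
    return [counts[aa] / len(seq) for aa in AMINO_ACIDS]


def disorder_profile(seq, n_segments=3):
    """Return the fraction of disorder-promoting residues in each of
    n_segments consecutive, roughly equal-length segments of seq."""
    if not seq:
        raise ValueError("sequence must not be empty")
    n = len(seq)
    profile = []
    for i in range(n_segments):
        segment = seq[i * n // n_segments:(i + 1) * n // n_segments]
        hits = sum(1 for residue in segment if residue in DISORDER_PROMOTING)
        # A segment can be empty when the sequence is shorter than n_segments.
        profile.append(hits / len(segment) if segment else 0.0)
    return profile


def pair_features(seq_a, seq_b, n_segments=3):
    """Concatenate the per-protein features of both partners into one vector."""
    return (composition(seq_a) + disorder_profile(seq_a, n_segments)
            + composition(seq_b) + disorder_profile(seq_b, n_segments))


# Toy sequences, invented for the example.
features = pair_features("MSEEKPKAAAPS", "MKWVTFISLLLLFSSAYS")
print(len(features))
```

A vector like this, computed for known interacting and non-interacting pairs, is the kind of input a supervised classifier could be trained on. Note that simple concatenation makes the vector depend on the order of the two proteins.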


Summary

Introduction

Intrinsically disordered proteins (IDPs) represent a structural class of proteins that lack well-defined tertiary structure in several regions or throughout the entire sequence[6,7,8]. IDPs are the prevailing protein class associated with noncommunicable diseases[18,19], and mapping the interactome of IDPs will lead to improved understanding of disease mechanisms and provide a platform for novel therapeutic approaches[20,21]. It is an important task of computational biology to provide model PPI networks of IDPs and to enable reliable predictions of candidate interactors. The flexible structure of IDPs restricts the application of precise computational methods that model interactions by docking. Docking methods rely on putative binding modes derived from favourable interaction energies and surface complementarities, and therefore require candidates with stable, well-defined, globular three-dimensional structures[22,23]. With a strict evaluation procedure and attention to potential setbacks intrinsic to data-based methods, we demonstrate that our classifier captures key characteristics of PPIs and can predict interactions of an IDP of interest with significant efficiency.
