Abstract

Background: The earliest whole protein order/disorder predictor (Uversky et al., Proteins, 41: 415-427 (2000)), herein called the charge-hydropathy (C-H) plot, was originally developed using the Kyte-Doolittle (1982) hydropathy scale (Kyte & Doolittle, J. Mol. Biol., 157: 105-132 (1982)). Here the goal is to determine whether the performance of the C-H plot in separating structured and disordered proteins can be improved by using an alternative hydropathy scale.

Results: Using the performance of the C-H plot as the metric, we compared 19 alternative hydropathy scales and found that the Guy (1985) hydropathy scale (Guy, Biophys. J., 47: 61-70 (1985)) was the best of the tested scales for separating large collections of structured proteins and intrinsically disordered proteins (IDPs) on the C-H plot. Next, we developed a new scale, named IDP-Hydropathy, which further improves the discrimination between structured proteins and IDPs. Applying the C-H plot to a dataset containing 109 IDPs and 563 non-homologous fully structured proteins, the Kyte-Doolittle (1982) hydropathy scale, the Guy (1985) hydropathy scale, and the IDP-Hydropathy scale gave balanced two-state classification accuracies of 79%, 84%, and 90%, respectively, indicating that a substantial overall improvement is obtained by using different hydropathy scales. A correlation study shows that IDP-Hydropathy is strongly correlated with other hydropathy scales, suggesting that IDP-Hydropathy probably has only minor contributions from amino acid properties other than hydropathy.

Conclusion: We suggest that IDP-Hydropathy would likely be the best scale to use for any type of algorithm developed to predict protein disorder.

Highlights

  • The earliest whole protein order/disorder predictor (Uversky et al., Proteins, 41: 415-427 (2000)), called the charge-hydropathy (C-H) plot, was originally developed using the Kyte-Doolittle (1982) hydropathy scale

  • Many computational tools have been developed for predicting intrinsically disordered proteins (IDPs) and IDP regions from amino acid sequence, including several Predictors of Natural Disordered Regions (PONDR®s) [16,17,18,19], IUPred [20,21], DisoPred [7,22], SPINE-D [23], FoldIndex [24] and more than 50 others [25,26]

  • We further evaluated the results using metrics designed to evaluate predictors trained on imbalanced data (Table 2), including the F-score (Table 2, column 1), Matthews Correlation Coefficient (MCC; Table 2, column 2), Positive Predictive Value (PPV; Table 2, column 3), and Negative Predictive Value (NPV; Table 2, column 4); see Methods for more discussion of these metrics
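As a rough illustration of the metrics named above, the following Python sketch computes balanced accuracy, F-score, MCC, PPV, and NPV from a two-state confusion matrix using their standard textbook definitions. The function name, the choice of IDPs as the positive class, and the example counts are assumptions made for illustration only; they are not taken from the paper or its Table 2.

```python
import math


def classification_metrics(tp, fp, tn, fn):
    """Standard two-state metrics from confusion-matrix counts.

    Assumption: the positive class is "disordered" (IDP) and the
    negative class is "structured"; the paper may define them differently.
    """
    sensitivity = tp / (tp + fn)          # fraction of positives recovered
    specificity = tn / (tn + fp)          # fraction of negatives recovered
    ppv = tp / (tp + fp)                  # positive predictive value
    npv = tn / (tn + fn)                  # negative predictive value
    f_score = 2 * tp / (2 * tp + fp + fn)  # F1, harmonic mean of PPV and sensitivity
    mcc_denominator = math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    # MCC is conventionally set to 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / mcc_denominator if mcc_denominator else 0.0
    return {
        "balanced_accuracy": (sensitivity + specificity) / 2,
        "f_score": f_score,
        "mcc": mcc,
        "ppv": ppv,
        "npv": npv,
    }


# Hypothetical counts, chosen only to show an imbalanced two-class case.
example = classification_metrics(tp=8, fp=5, tn=85, fn=2)
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

Balanced accuracy averages the per-class recovery rates, so it is not inflated when one class (here, structured proteins) greatly outnumbers the other.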


Summary

Introduction

One of the more widely used prediction methods is based on a very simple model: repulsion from like charges favors unfolding while increased hydrophobicity favors folding [30]. In this approach, normalized net charge is plotted against normalized hydropathy, which is calculated from the hydropathy scale developed by Kyte and Doolittle (1982) [31], giving the charge-hydropathy (C-H) plot. This simple C-H plot largely separates IDPs from structured proteins [30]. This model has been used both for whole protein disorder prediction via the C-H plot [30] and for residue-by-residue disorder prediction via the FoldIndex algorithm [31].
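The two C-H plot coordinates described above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the function name is hypothetical, the hydropathy values are the published Kyte-Doolittle (1982) scale, and the normalization choices (rescaling hydropathy linearly to the range 0 to 1, counting K and R as +1 and D and E as -1, treating histidine as neutral, and taking the absolute value of the mean net charge) are assumptions, since the excerpt does not specify them. The boundary line that separates the two classes on the plot is not reproduced here.

```python
# Kyte-Doolittle (1982) hydropathy values for the 20 standard residues.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Assumed charge model: K/R positive, D/E negative, everything else neutral.
CHARGE = {"K": 1, "R": 1, "D": -1, "E": -1}

SCALE_MIN = min(KYTE_DOOLITTLE.values())  # -4.5 (arginine)
SCALE_MAX = max(KYTE_DOOLITTLE.values())  # 4.5 (isoleucine)


def ch_coordinates(sequence):
    """Return (mean normalized hydropathy, absolute mean net charge).

    Residues outside the 20 standard amino acids are skipped.
    """
    residues = [aa for aa in sequence.upper() if aa in KYTE_DOOLITTLE]
    if not residues:
        raise ValueError("sequence contains no standard amino acids")
    # Rescale each hydropathy value linearly onto [0, 1], then average.
    mean_hydropathy = sum(
        (KYTE_DOOLITTLE[aa] - SCALE_MIN) / (SCALE_MAX - SCALE_MIN)
        for aa in residues
    ) / len(residues)
    net_charge = sum(CHARGE.get(aa, 0) for aa in residues)
    mean_net_charge = abs(net_charge) / len(residues)
    return mean_hydropathy, mean_net_charge


# Usage: a made-up, highly charged peptide versus a made-up hydrophobic one.
print(ch_coordinates("KKEKKSPKKEK"))
print(ch_coordinates("LIVALLFVIAG"))
```

Swapping in a different hydropathy scale, such as the Guy (1985) scale or IDP-Hydropathy, would only require replacing the dictionary of per-residue values; those values are not given in this excerpt and so are not included.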
