Abstract

Protein-peptide interactions form an important subset of the total protein interaction network in the cell and play key roles in signaling and regulatory networks, and in major biological processes such as cellular localization, protein degradation, and immune response. In this work, we describe the LMDIPred web server, an online resource for generalized prediction of linear peptide sequences that may bind to the three most prevalent and well-studied peptide recognition modules (PRMs): SH3, WW, and PDZ. We developed support vector machine (SVM)-based prediction models that achieved a maximum Matthews Correlation Coefficient (MCC) of 0.85 with an accuracy of 94.55% for SH3, an MCC of 0.90 with an accuracy of 95.82% for WW, and an MCC of 0.83 with an accuracy of 92.29% for PDZ binding peptides. LMDIPred output combines predictions from these SVM models with predictions from Position-Specific Scoring Matrices (PSSMs) and from string-matching methods that use known domain-binding motif instances and regular expressions. All of these methods were evaluated using five-fold cross-validation on both balanced and unbalanced datasets, and were also validated on independent datasets. LMDIPred aims to provide a preliminary bioinformatics platform for sequence-based prediction of probable binding sites for SH3, WW, or PDZ domains.
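Because the models are reported with both accuracy and MCC, and were evaluated on balanced as well as unbalanced datasets, it may help to see how the two metrics are derived from a binary confusion matrix. The following is a minimal sketch using the standard definitions; the function names and the example counts are hypothetical illustrations and are not taken from the LMDIPred evaluation.

```python
import math


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews Correlation Coefficient for a binary confusion matrix.

    Ranges from -1 (total disagreement) to +1 (perfect prediction).
    Returns 0.0 when any marginal sum is zero, a common convention.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts (NOT from the paper): a balanced and an unbalanced set.
balanced = dict(tp=90, tn=95, fp=5, fn=10)
unbalanced = dict(tp=5, tn=900, fp=5, fn=90)

for name, counts in (("balanced", balanced), ("unbalanced", unbalanced)):
    print(name, round(accuracy(**counts), 4), round(mcc(**counts), 4))
```

In the hypothetical unbalanced case, accuracy remains above 0.9 even though most true binders are missed, whereas MCC falls sharply. This is the usual reason for reporting MCC alongside accuracy when negatives far outnumber positives.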

Highlights

  • Protein-protein interactions (PPIs) are primary regulators of protein functions [1], and a large number of PPIs are known to be mediated by short contiguous peptide segments, which are represented as sequence patterns known as Linear Motifs (LMs) [2]

  • The highly promiscuous binding patterns displayed by the peptide-binding domains, reflecting their intrinsic ability to recognize a diverse set of ligands, make the prediction of specific domain-binding peptides a highly challenging task

  • We have compiled the positive training datasets comprising experimentally validated SH3, WW, and PDZ domain binding peptides from the LMPID database [17]

Summary

Introduction

Protein-protein interactions (PPIs) are primary regulators of protein functions [1], and a large number of PPIs are known to be mediated by short contiguous peptide segments, which are represented as sequence patterns known as Linear Motifs (LMs) [2]. LM peptides are generally found in intrinsically disordered regions and act as recognition sites for low-affinity but highly specific domain-peptide interactions, mediating PPIs that are transient yet critical for various signaling and regulatory pathways [3]. Peptide-mediated PPIs have been implicated in several diseases, including cancer and some neurodegenerative and genetic disorders [4]. Identification of such short LM peptide sequences within proteins may be useful in targeting specific disease-associated domain-peptide interactions for therapeutic modulation [5]. The computational challenge in predicting the occurrence of such peptides is that these
