Abstract

Background: Minimotifs are short contiguous peptide sequences in proteins that are known to have a function in at least one other protein. One of the principal limitations in minimotif prediction is that false positives limit the usefulness of this approach. As a step toward resolving this problem, we have built, implemented, and tested a new data-driven algorithm that reduces false-positive predictions.

Methodology/Principal Findings: Certain domains and minimotifs are known to be strongly associated with a known cellular process or molecular function. We therefore hypothesized that a minimotif prediction is more likely to be accurate when the minimotif-containing protein and the target protein have a related cellular or molecular function, and we restricted predictions to those cases. This filter was implemented in Minimotif Miner using function annotations from the Gene Ontology. We have also combined two filters that are based on entirely different principles, and the combined filter predicts better than either of its individual components.

Conclusions/Significance: Testing these functional filters on known and random minimotifs has revealed that they are capable of separating true motifs from false positives. In particular, for the cellular function filter, the percentage of known minimotifs that are not removed by the filter is ∼4.6 times that of random minimotifs. For the molecular function filter this ratio is ∼2.9. These results, together with the comparison with the published frequency score filter, strongly suggest that the new filters differentiate true motifs from random background with good confidence. A combination of the function filters and the frequency score filter performs better than either filter on its own.
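The filtering idea and the reported ratios can be illustrated with a minimal sketch. It assumes that "related function" means the two proteins share at least one annotation term; the abstract does not specify the actual relatedness rule used in Minimotif Miner. The protein names, annotation labels, and function names below are hypothetical placeholders, not real Gene Ontology data or the published implementation.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy annotations: protein name -> set of function labels.
# Real use would load Gene Ontology annotations instead.
ANNOTATIONS: Dict[str, Set[str]] = {
    "P1": {"signaling"},
    "P2": {"signaling", "transport"},
    "P3": {"replication"},
    "P4": {"replication"},
}

Prediction = Tuple[str, str]  # (minimotif-containing protein, target protein)


def passes_function_filter(pred: Prediction,
                           annotations: Dict[str, Set[str]]) -> bool:
    """Keep a prediction only if both proteins share at least one label.

    Assumption: shared label == related function.
    """
    source, target = pred
    shared = annotations.get(source, set()) & annotations.get(target, set())
    return bool(shared)


def retention_rate(preds: List[Prediction],
                   annotations: Dict[str, Set[str]]) -> float:
    """Fraction of predictions that are not removed by the filter."""
    if not preds:
        return 0.0
    kept = sum(passes_function_filter(p, annotations) for p in preds)
    return kept / len(preds)


def enrichment_ratio(known: List[Prediction],
                     random_preds: List[Prediction],
                     annotations: Dict[str, Set[str]]) -> float:
    """Retention rate of known minimotifs divided by that of random ones."""
    random_rate = retention_rate(random_preds, annotations)
    if random_rate == 0.0:
        return float("inf")
    return retention_rate(known, annotations) / random_rate


# Toy data: two "known" pairs and four "random" pairs.
known_pairs = [("P1", "P2"), ("P3", "P4")]
random_pairs = [("P1", "P4"), ("P3", "P2"), ("P1", "P2"), ("P2", "P4")]

ratio = enrichment_ratio(known_pairs, random_pairs, ANNOTATIONS)
```

In this toy data set both known pairs pass the filter and one of four random pairs passes, so the ratio is 4.0. The ∼4.6 and ∼2.9 values in the abstract are the same kind of quantity, computed on the authors' own data.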

Highlights

  • Minimotifs are short contiguous peptide pieces of proteins that have a known biological function

  • False-positive predictions limit the usefulness of minimotif prediction programs such as Minimotif Miner (MnM) [1,2], Eukaryotic Linear Motif (ELM) [3,4], and ScanSite [5,6]

  • In ScanSite [5,6], minimotifs are described as position-specific scoring matrices (PSSMs) that indicate the frequency of each amino acid at each position using data derived from peptide library and phage display experiments [8,9]


Summary

Introduction

Minimotifs are short contiguous peptide pieces of proteins that have a known biological function. While there are many known functional minimotifs, predicting a minimotif in a new protein based on a consensus sequence, position-specific scoring matrix, or other algorithm produces many false-positive predictions. This limits the usefulness of minimotif prediction programs such as Minimotif Miner (MnM) [1,2], Eukaryotic Linear Motif (ELM) [3,4], and ScanSite [5,6]. As a step toward resolving this problem, we have built, implemented, and tested a new data-driven algorithm that reduces false-positive predictions.
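To see why consensus-based prediction tends to over-call, consider a minimal sketch that scans a sequence for a short degenerate pattern. The PxxP-style pattern, the example sequence, and the function names are illustrative assumptions, not taken from the paper. The chance-hit estimate further assumes uniform, independent amino acid frequencies, which is a deliberate simplification.

```python
import re
from typing import List, Tuple


def scan_consensus(sequence: str, pattern: str) -> List[Tuple[int, str]]:
    """Return (0-based start, matched text) for every match of `pattern`.

    A lookahead is used so that overlapping occurrences are also reported.
    """
    regex = re.compile(f"(?=({pattern}))")
    return [(m.start(), m.group(1)) for m in regex.finditer(sequence)]


def expected_chance_hits(seq_length: int, motif_length: int,
                         fixed_positions: int) -> float:
    """Expected number of chance matches in a random sequence.

    Assumption: each of the 20 amino acids is equally likely at every
    position, so each fixed position matches with probability 1/20.
    """
    windows = max(seq_length - motif_length + 1, 0)
    return windows * (1 / 20) ** fixed_positions


# Hypothetical sequence; "P..P" means P, any two residues, then P.
hits = scan_consensus("MPAAPPLLPKKPA", "P..P")
# hits == [(1, "PAAP"), (5, "PLLP"), (8, "PKKP")]

# A 403-residue random protein has 400 windows of length 4.
chance = expected_chance_hits(403, 4, 2)  # about 1.0
```

Under these assumptions a pattern with only two fixed positions is expected to match about once per 400-residue protein by chance alone, which is why sequence matching on its own yields many candidates that need additional filtering.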
