Abstract

Background: False occurrences of functional motifs in protein sequences can be considered as random events due solely to the sequence composition of a proteome. Here we use a numerical approach to investigate the random appearance of functional motifs with the aim of addressing biological questions such as: How are organisms protected from undesirable occurrences of motifs otherwise selected for their functionality? Has the random appearance of functional motifs in protein sequences been affected during evolution?

Results: Here we analyse the occurrence of functional motifs in random sequences and compare it to that observed in biological proteomes; the behaviour of random motifs is also studied. Most motifs exhibit a number of false positives significantly similar to the number of times they appear in randomized proteomes (=expected number of false positives). Interestingly, about 3% of the analysed motifs show a different kind of behaviour and appear in biological proteomes less than they do in random sequences. In some of these cases, a mechanism of evolutionary negative selection is apparent; this helps to prevent unwanted functionalities which could interfere with cellular mechanisms.

Conclusion: Our thorough statistical and biological analysis showed that there are several mechanisms and evolutionary constraints both of which affect the appearance of functional motifs in protein sequences.

Highlights

  • False occurrences of functional motifs in protein sequences can be considered as random events due solely to the sequence composition of a proteome

  • The PROSITE database provides, for each entry, complete lists of Swiss-Prot proteins manually verified for true positive (TP), false positive (FP), and false negative (FN) assignments [4]

  • True and false positives of PROSITE patterns are manually verified by expert curators through both the literature and the information retrieved from other databases such as Swiss-Prot or Pfam [13]


Introduction

False occurrences of functional motifs in protein sequences can be considered as random events due solely to the sequence composition of a proteome. Sternberg [2] used the calculated expectations as a benchmark for evaluating motif matches on the Swiss-Prot database as annotated in PROSITE; Nevill-Manning and co-workers [3] used such expectations for assessing the specificity of motifs exhaustively generated from a multiple sequence alignment of related proteins. From this perspective, the number of occurrences of a motif in a set of proteins can be regarded as the sum of the functional occurrences plus the random occurrences, i.e. motif matches explained by the sequence composition alone [6].
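To make the idea of random occurrences "explained by the sequence composition alone" concrete, the following is a minimal sketch of how an expected match count could be estimated. It is an illustration under stated assumptions, not the numerical approach used in this work: residues are assumed independent and identically distributed according to the overall composition, and the motif is assumed to have a fixed length with no variable-length gaps. The function names and the motif representation (a list of sets of allowed residues, one set per position) are hypothetical.

```python
from collections import Counter
from math import prod


def residue_frequencies(sequences):
    """Overall amino-acid composition of a set of protein sequences."""
    counts = Counter()
    for seq in sequences:
        counts.update(seq)
    total = sum(counts.values())
    return {aa: n / total for aa, n in counts.items()}


def match_probability(motif, freqs):
    """Probability that a random window matches the motif.

    `motif` is a list of sets; each set holds the residues allowed at
    that position (assumption: fixed-length motif, independent positions).
    """
    return prod(sum(freqs.get(aa, 0.0) for aa in allowed) for allowed in motif)


def expected_random_matches(motif, sequences):
    """Expected number of matches due to sequence composition alone."""
    freqs = residue_frequencies(sequences)
    p = match_probability(motif, freqs)
    # Number of start positions at which a window of motif length fits.
    windows = sum(max(len(seq) - len(motif) + 1, 0) for seq in sequences)
    return p * windows
```

Under these assumptions, the observed number of matches minus the value returned by `expected_random_matches` would give a rough estimate of the functional occurrences. Shuffling real sequences and counting matches directly is an alternative that avoids the independence assumption for the counting step.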
