Abstract

Prediction of functional peptide motifs from sequences is an important problem in bioinformatics. Experimental discovery of functional sequences is laborious. Searches for specific motifs within the growing number of proteins available in public databases often involve extensive computation. Short peptide motifs are especially hard to identify with currently available methods. Presented here is a simple and effective procedure for identifying a short functional motif. The procedure is based on devising a scoring function from sequence properties. It was applied to the short engrailed homology-1 (eh1)-like motif. The eh1-like motif provides repressive functions by binding to the WD domain of the Gro/TLE transcriptional corepressors. Interactions between known eh1-like variants and the WD domain were modeled and studied. Sequence features crucial for these interactions, and thus for the motif's functionality, were identified. A scoring function was devised based on the observed sequence features. The ability of the scoring function to discriminate between functional and nonfunctional sequences was tested on known eh1-like sequences, random sequences, and eh1-like sequences predicted by others using various bioinformatics tools. The scoring function captured the general relationship between sequences and their functionality. It gave about 20% false positives. However, it reliably identified sequences that were not functional eh1-like motifs. The procedure presented here may prove useful for predicting functional sequences of other short motifs. Given the importance of transcriptional regulation, this study on the identification and evaluation of functional eh1-like sequences should facilitate further research on their transcriptional roles.
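As a rough illustration of the general idea (scoring short sequence windows by sequence features and then measuring how often nonfunctional sequences pass the cutoff), a minimal sketch follows. The window length, the position preferences, the weights, and the threshold are all hypothetical placeholders chosen for the example; they are not the features or the scoring function derived in this study.

```python
from typing import Dict, List, Tuple

# Hypothetical position-specific preferences for a 7-residue window.
# Placeholders for illustration only, NOT the features from the study.
WINDOW = 7
PREFERRED: Dict[int, Tuple[str, float]] = {
    0: ("FYW", 2.0),   # assumed: aromatic residue at the first position
    2: ("ILVM", 1.0),  # assumed: hydrophobic residue at the third position
    5: ("ILVM", 1.0),  # assumed: hydrophobic residue at the sixth position
    6: ("ILVM", 1.0),  # assumed: hydrophobic residue at the seventh position
}


def score_window(window: str) -> float:
    """Sum the weights of all positions whose residue is in the preferred set."""
    if len(window) != WINDOW:
        raise ValueError(f"expected a window of {WINDOW} residues")
    return sum(
        weight
        for pos, (residues, weight) in PREFERRED.items()
        if window[pos] in residues
    )


def scan(sequence: str, threshold: float) -> List[Tuple[int, str, float]]:
    """Slide a fixed-length window along a protein sequence and
    return (start index, window, score) for windows at or above the threshold."""
    hits = []
    for start in range(len(sequence) - WINDOW + 1):
        window = sequence[start:start + WINDOW]
        s = score_window(window)
        if s >= threshold:
            hits.append((start, window, s))
    return hits


def false_positive_rate(nonfunctional: List[str], threshold: float) -> float:
    """Fraction of known-nonfunctional windows that score at or above the threshold."""
    if not nonfunctional:
        raise ValueError("need at least one nonfunctional sequence")
    flagged = sum(1 for seq in nonfunctional if score_window(seq) >= threshold)
    return flagged / len(nonfunctional)
```

A scan of this kind is linear in sequence length and needs no alignment, which is why a feature-based score can be cheaper than alignment-based searches. How well it discriminates depends entirely on the features and cutoff chosen.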

Highlights

  • The random sequences were checked against protein databases to make sure they were not functional

Introduction

Development of dependable computational methods to identify functional peptide motifs is an important goal in bioinformatics. Experimental discovery of functional peptide sequences is often arduous. High-throughput methods give large numbers of false positive and false negative results. Extensive computation is often needed to search for specific motifs within protein databases. Identification of short motifs using currently available methods is especially difficult. Several motif-searching programs rely on multiple sequence alignment. One of them is the powerful program Clustal W [1]. Aligning sequences with Clustal W requires significant computational resources. It can take hours of sequential computing to align a few hundred sequences
