Abstract

Background: One of the strategies for protein function annotation is to search for particular structural motifs that are known to be shared by proteins with a given function.

Results: Here, we present a systematic extraction of structural motifs of seven residues from protein loops and we explore their correspondence with functional sites. Our approach is based on the structural alphabet HMM-SA (Hidden Markov Model - Structural Alphabet), which allows simplification of protein structures into one-dimensional sequences, and on advanced pattern statistics adapted to short sequences. Structural motifs of interest are selected by looking for motifs that are significantly over-represented in SCOP superfamilies in protein loops. We discovered two types of such motifs: (i) ubiquitous motifs, shared by several superfamilies, and (ii) superfamily-specific motifs, over-represented in few superfamilies. A comparison of ubiquitous words with known small structural motifs shows that they contain well-described motifs such as turn, niche or nest motifs. A comparison between superfamily-specific motifs and the biological annotations of Swiss-Prot reveals that some of them correspond to functional sites involved in binding small ligands, such as ATP/GTP, NAD(P) and SAH/SAM.

Conclusions: Our findings show that statistical over-representation in SCOP superfamilies is linked to functional features. The detection of over-represented motifs within structures simplified by HMM-SA is therefore a promising approach for the prediction of functional sites and the annotation of uncharacterized proteins.
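The over-representation screen described in the Results can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract refers to advanced pattern statistics adapted to short sequences, whereas this sketch substitutes a plain binomial tail test against the pooled word frequency. It also assumes that loops are already encoded as strings of structural letters, and that each letter covers four consecutive residues, so that a word of four overlapping letters spans seven residues. The function names, the `alpha` threshold and the input layout are hypothetical.

```python
from collections import Counter
from math import exp, lgamma, log

# Assumption: one structural letter covers 4 residues, so 4 overlapping
# letters describe a 7-residue motif.
WORD_LEN = 4


def words(encoded_loop, k=WORD_LEN):
    """Yield every overlapping k-letter word of an encoded loop string."""
    return (encoded_loop[i:i + k] for i in range(len(encoded_loop) - k + 1))


def binomial_tail(n, k, p):
    """Return P(X >= k) for X ~ Binomial(n, p), summed in log space."""
    if k <= 0 or p >= 1.0:
        return 1.0
    if p <= 0.0:
        return 0.0

    def log_pmf(i):
        return (lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
                + i * log(p) + (n - i) * log(1.0 - p))

    return min(1.0, sum(exp(log_pmf(i)) for i in range(k, n + 1)))


def over_represented(loops_by_family, alpha=0.01):
    """Return {(family, word): p_value} for words enriched in a family.

    loops_by_family maps a superfamily identifier to a list of encoded
    loop strings. The background frequency of a word is its share of all
    words pooled over every family (a simplifying stand-in, not the
    statistics used in the paper).
    """
    family_counts = {
        family: Counter(w for loop in loops for w in words(loop))
        for family, loops in loops_by_family.items()
    }
    background = Counter()
    for counts in family_counts.values():
        background.update(counts)
    total = sum(background.values())

    hits = {}
    for family, counts in family_counts.items():
        n_words = sum(counts.values())
        for word, observed in counts.items():
            p_value = binomial_tail(n_words, observed, background[word] / total)
            if p_value < alpha:
                hits[(family, word)] = p_value
    return hits
```

In this sketch, a word flagged in many families would play the role of a ubiquitous motif and a word flagged in few families that of a superfamily-specific motif. A real screen would also need a correction for multiple testing, which is omitted here.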

Highlights

  • One of the strategies for protein function annotation is to search for particular structural motifs that are known to be shared by proteins with a given function

  • We propose an approach, inspired by studies that require no prior information about the location of functional sites, to identify structural motifs in loops involved in protein function

  • Extraction of structural motifs over-represented in SCOP superfamilies: the goal of our study is to systematically identify structural motifs of interest, i.e. motifs with structural or functional implications, in protein loops


Summary

Introduction

One of the strategies for protein function annotation is to search for particular structural motifs that are known to be shared by proteins with a given function. Building on the loop classification system ArchDB [3], Espadaler et al. [14] developed an approach to identify loop clusters associated with the protein functional sites provided by the PROSITE database [15] or Gene Ontology (GO) [16]. They showed that loops contain structural motifs involved in the functional sites of proteins. Tendulkar et al. [17] and Manikandan et al. [18] extracted octapeptide clusters involved in protein function. They first classified octapeptides using geometric invariants [17] or dihedral angles [18]. The common point between all these studies is that no prior information about the location of the functional sites is required, making it possible to discover new functional sites.
