Abstract

We present DAVROS, a method to detect, localize, and validate repeating motifs in protein structure while allowing for insertions and deletions (indels). DAVROS takes the score matrix produced by a structural alignment program (SAP) and searches it for repeating motifs, using an algorithm based on concepts from signal processing and on the statistical properties of the alignments. The method was tested against a nonredundant subset of the Protein Data Bank, and each chain was assigned a score. Of the top 50 chains ranked by score, 70% contain repeating motifs that were detected without error. These represent 14 types of fold covering the alpha, beta, and alpha-beta protein classes. A second data set, comprising protein chains from different sequence families with triosephosphate isomerase (TIM) barrel, leucine-rich repeat (LRR), trefoil, and alpha-alpha barrel folds, was used to assess the ability of DAVROS to detect all motifs within a specific fold. For this second set, the percentage of motifs detected was highest for the LRR chains (88.7%) and lowest for the TIM barrels (60%). This variability reflects the regularity of the LRR motif compared with the alpha-beta units of the TIM barrel, which generally have many more indels. Indels reduce the strength of the repeat signal in the SAP matrix, making repeat detection more difficult.
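The abstract does not spell out the algorithm, so the following is only an illustration of the general idea that a repeat shows up as a periodic signal in a self-comparison score matrix. It is a minimal sketch, not the DAVROS method: it assumes a square matrix of residue-against-residue scores for a single chain, averages the scores along each off-diagonal, and reports the offset with the strongest mean as a candidate repeat length. The function names, the `min_offset` parameter, and the toy matrix are all hypothetical and are not taken from DAVROS or SAP.

```python
def diagonal_profile(score):
    """Mean score along each off-diagonal k = 1 .. n-1 of a square matrix.

    profile[k - 1] holds the mean of score[i][i + k] over all valid i.
    """
    n = len(score)
    profile = []
    for k in range(1, n):
        values = [score[i][i + k] for i in range(n - k)]
        profile.append(sum(values) / len(values))
    return profile


def estimate_repeat_period(score, min_offset=3):
    """Return the offset (in residues) with the highest mean diagonal score.

    Assumption: offsets below min_offset are ignored because neighbouring
    residues always look similar; offsets above n // 2 are ignored because
    a repeat must occur at least twice within the chain.
    """
    n = len(score)
    profile = diagonal_profile(score)
    candidates = range(min_offset, n // 2 + 1)
    if len(candidates) == 0:
        raise ValueError("matrix too small for the requested min_offset")
    # max() keeps the first maximum, so the shortest period wins ties
    # between a period and its multiples.
    return max(candidates, key=lambda k: profile[k - 1])


# Toy self-comparison matrix (hypothetical data): 40 residues with a
# perfect 10-residue repeat and no indels.
n, period = 40, 10
toy = [[1.0 if (i - j) % period == 0 else 0.0 for j in range(n)]
       for i in range(n)]

print(estimate_repeat_period(toy))
```

In this idealized case every repeat lines up on the same off-diagonals. An indel shifts the matching residues onto a neighbouring diagonal, so the signal is spread over several offsets and each individual peak is weaker, which is consistent with the lower detection rate the abstract reports for TIM barrels.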
