Abstract

As sequence and structure comparison algorithms gain sensitivity, the intrinsic interconnectedness of the protein universe has become increasingly apparent. Despite this general trend, β-trefoils have emerged as an uncommon counterexample: They are an isolated protein lineage for which few, if any, sequence or structure associations to other lineages have been identified. If β-trefoils are, in fact, remote islands in sequence-structure space, it implies that the oligomerizing peptide that founded the β-trefoil lineage itself arose de novo. To better understand β-trefoil evolution, and to probe the limits of fragment sharing across the protein universe, we identified both ‘β-trefoil bridging themes’ (evolutionarily-related sequence segments) and ‘β-trefoil-like motifs’ (structure motifs with a hallmark feature of the β-trefoil architecture) in multiple, ostensibly unrelated, protein lineages. The success of the present approach stems, in part, from considering β-trefoil sequence segments or structure motifs rather than the β-trefoil architecture as a whole, as has been done previously. The newly uncovered inter-lineage connections presented here suggest a novel hypothesis about the origins of the β-trefoil fold itself–namely, that it is a derived fold formed by ‘budding’ from an Immunoglobulin-like β-sandwich protein. These results demonstrate how the evolution of a folded domain from a peptide need not be a signature of antiquity and underpin an emerging truth: few protein lineages escape nature’s sewing table.

Highlights

  • Proteins are often approximated as discrete domains from distinct evolutionary lineages

  • We report a systematic search for β-trefoil-like sequence segments and structure motifs (β-trefoil-like motifs) across the known protein universe, foremost to understand the evolutionary history of β-trefoils and to search for an example of the ‘patchwork model’ of protein evolution

  • Our results demonstrate that the β-trefoil is nowhere near as isolated as previously thought; instead, β-trefoil proteins appear to be a reservoir for sequence innovation in other protein lineages and vice versa

Introduction

Proteins are often approximated as discrete domains from distinct evolutionary lineages. Such classification is powerful, foremost because it can serve as a framework for understanding how proteins within a family change over time, and because it naturally lends itself to naming and aids in communication (as is true for taxonomy in general). This notion of separability is, however, an approximation: Regardless of whether structure [1,2], sequence [3–6], or both structure and sequence [7,8] are considered, similar segments between proteins that lack global sequence identity are detectable and common. When these segments are highly conserved within diverse protein families, or overlap with important functional sites, they can provide insight into early protein evolution [8,10] and reveal distant evolutionary relationships [11].
