Abstract

The vast majority of theoretically possible polypeptide chains do not fold, let alone confer function. Hence, protein evolution from preexisting building blocks has clear potential advantages over ab initio emergence from random sequences. In support of this view, sequence similarity between different proteins is generally indicative of common ancestry, and we collectively refer to such homologous sequences as “themes.” At the domain level, sequence homology is routinely detected. However, short themes, which are segments or fragments of intact domains, are particularly interesting because they may provide hints about the emergence of domains, as opposed to the divergence of preexisting domains or their mixing and matching to form multi-domain proteins. Here we identified 525 representative short themes, comprising 20–80 residues, that are unexpectedly shared between domains considered to have emerged independently. Among these “bridging themes” are ones shared between the most ancient domains, for example, Rossmann, P-loop NTPase, TIM-barrel, flavodoxin, and ferredoxin-like. We elaborate on several particularly interesting cases in which the bridging themes mediate ligand binding. Ligand binding may have contributed to the stability and plasticity of these building blocks, and to their ability to invade preexisting domains or serve as starting points for completely new domains.

Highlights

  • Over the course of 3.7 billion years of protein evolution, protein segments of varying lengths mutated, duplicated, and recombined [1,2,3,4,5]

  • We identify themes shared between non-homologous protein domains – cases where similar protein segments are found in two different contexts

  • The remaining levels of the hierarchy, from the X level downwards, group domains based on presumed common ancestry

Introduction

Over the course of 3.7 billion years of protein evolution, protein segments of varying lengths mutated, duplicated, and recombined [1,2,3,4,5]. A likely scenario is that early proteins evolved by duplication and fusion of short polypeptides that had at least marginal stability and weak biological function, enough for them to be favored over random alternatives. By mining protein databases [6,7,8,9], one can computationally search for traces of the evolutionary events that shaped the current protein universe, such as mutations, duplications, and recombinations of short protein segments (e.g., [4,10,11,12,13,14,15]). Because sampling a specific sequence (even one as short as a few dozen residues) from the vast number of possible sequences is an extremely low-probability event, common ancestry (homology) is the more likely explanation when sequence segments of sufficient similarity are detected [16].
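To give a rough sense of scale for this probability argument, the sketch below counts the sequences possible at a given length and estimates how often two unrelated segments would agree at many positions by chance. It is a back-of-the-envelope illustration, not the statistical method used in the study: it assumes 20 equally likely amino acids, independent positions, and exact residue identity with no gaps, and the function names are hypothetical. Under these assumptions, a 20-residue segment has about 10^26 possible sequences and an 80-residue segment about 10^104.

```python
import math

AMINO_ACIDS = 20  # assumption: uniform, independent residue frequencies


def log10_sequence_space(length: int) -> float:
    """Return log10 of the number of distinct sequences of this length."""
    return length * math.log10(AMINO_ACIDS)


def prob_at_least_k_matches(length: int, k: int, p: float = 1 / AMINO_ACIDS) -> float:
    """Binomial tail: chance that a random sequence agrees with a fixed
    target at k or more of `length` aligned positions."""
    return sum(
        math.comb(length, i) * p**i * (1 - p) ** (length - i)
        for i in range(k, length + 1)
    )


if __name__ == "__main__":
    for length in (20, 40, 80):
        half = length // 2
        print(
            f"length {length}: 10^{log10_sequence_space(length):.1f} sequences, "
            f"P(>= {half} identical positions by chance) = "
            f"{prob_at_least_k_matches(length, half):.2e}"
        )
```

Real amino acid frequencies are not uniform, and homology searches score similarity rather than strict identity, so these figures only indicate the order of magnitude behind the argument.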
