Abstract
Background: In protein evolution, the mechanism by which novel protein domains emerge is still an open question. The incremental growth of protein variable regions, which is produced by stochastic insertions, has the potential to generate large and complex sub-structures. In this study, a deterministic methodology is proposed to reconstruct phylogenies from protein structures and to infer insertion events in protein evolution. The analysis was performed on a broad range of SCOP domain families.

Results: Phylogenies were reconstructed from protein 3D structural data. The phylogenetic trees were used to infer ancestral structures with a consensus method. In these ancestral reconstructions, 42.7% of the observed insertions are nested insertions, that is, insertions located within earlier insert regions. The average size of inserts tends to increase with the insert rank or with the total number of insertions in the variable regions. We found that the structures of some nested inserts show complex or even domain-like fold patterns with helices, strands and loops. Furthermore, a basal level of structural innovation was found in inserts which displayed a significant structural similarity exclusively to themselves. The β-Lactamase/D-ala carboxypeptidase domain family is provided as an example to illustrate the inference of insertion events and to show how the incremental growth of a variable region can generate novel structural patterns.

Conclusion: Using 3D data, we proposed a method to reconstruct phylogenies. We applied the method to reconstruct the sequences of insertion events leading to the emergence of potentially novel structural elements within existing protein domains. The results suggest that structural innovation is possible via the stochastic process of insertions and rapid evolution within variable regions, where inserts tend to be nested. We also demonstrate that the structure-based phylogeny enables the study of new questions relating to the evolution of protein domains and biological function.
Highlights
In protein evolution, the mechanism by which novel protein domains emerge is still an open question
The results indicate that domains from Structural Classification of Proteins (SCOP) class α/β have more variable regions on average than those from the other three classes
Several works have utilized structure-based methods to study these SCOP families with low sequence homology, including the β-Lactamase/D-ala carboxypeptidase family [19], the Class II aminoacyl-tRNA synthetase-like family [22], and the short-chain alcohol dehydrogenases family [23]
Summary
The mechanism by which novel protein domains emerge, and more generally by which new structures emerge or evolve from existing proteins, is still an open question. The incremental growth of protein variable regions, which is produced by stochastic insertions, has the potential to generate large and complex sub-structures. Ancient domain families show a bias towards insertions in the variable region which grow in size [5]. A succession of insertions and rapid evolution appears to be a reasonable process that could lead to the emergence of novel protein architectures [2,6,7]. Determining the extent and the mechanism of emergence of protein structure is difficult because observations are limited to extant protein folds. The evolution of sub-structures over time must be inferred from this limited information.
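The notion of a nested insertion (an insertion located within an earlier insert region) can be illustrated with a small sketch. This is not the method used in the study: it assumes a single lineage whose insertion events are already known and ordered in time, and it treats an insertion as nested only when it falls strictly inside an earlier insert. The names `Region`, `classify_insertions` and `nested_fraction`, and the definition of rank as nesting depth, are hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Region:
    start: int  # index of the first inserted residue (current coordinates)
    end: int    # one past the last inserted residue
    rank: int   # assumed definition: 1 for a first-level insert, +1 per nesting level


def classify_insertions(events):
    """Classify chronologically ordered insertion events on one lineage.

    events: list of (position, length); position is the index at which the
    new residues are inserted, in the sequence as it exists just before
    that event. Returns a list of (is_nested, rank) tuples.
    """
    regions = []
    results = []
    for pos, length in events:
        # Find the most deeply nested earlier insert that strictly contains pos.
        parent = None
        for r in regions:
            if r.start < pos < r.end and (parent is None or r.rank > parent.rank):
                parent = r
        rank = parent.rank + 1 if parent else 1

        # Update earlier inserts to the post-insertion coordinates.
        for r in regions:
            if pos <= r.start:      # insert lies upstream: shift the region
                r.start += length
                r.end += length
            elif pos < r.end:       # insert lies inside: the region grows
                r.end += length

        regions.append(Region(pos, pos + length, rank))
        results.append((parent is not None, rank))
    return results


def nested_fraction(results):
    """Fraction of events classified as nested."""
    return sum(1 for nested, _ in results if nested) / len(results)


if __name__ == "__main__":
    # Hypothetical events: the second and third fall inside earlier inserts.
    demo = classify_insertions([(10, 5), (12, 4), (14, 2), (40, 3)])
    print(demo, nested_fraction(demo))
```

In this toy model, a variable region that receives repeated nested insertions accumulates inserts of increasing rank, which mirrors the incremental growth described above. Real inference would start from ancestral structure reconstructions on a phylogeny rather than from a known event list.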