Abstract

Despite the long history of using protein sequences to infer the tree of life, the potential for different parts of protein structures to retain historical signal remains unclear. We propose that it might be possible to improve analyses of phylogenomic datasets by incorporating information about protein structure. We test this idea using the position of the root of Metazoa (animals) as a model system. We examined the distribution of “strongly decisive” sites (alignment positions that support a specific tree topology) in a dataset comprising >1500 proteins and almost 100 taxa. The proportion of each class of strongly decisive sites in different structural environments was very sensitive to the model used to analyze the data when a limited number of taxa were used, but it was stable when taxa were added. As long as enough taxa were analyzed, sites in all structural environments supported the same topology regardless of whether standard tree searches or decisive sites were used to select the optimal tree. However, the use of decisive sites revealed a difference between the support for minority topologies for sites in different structural environments: buried sites and sites in sheet and coil environments exhibited equal support for the minority topologies, whereas solvent-exposed and helix sites had unequal numbers of sites supporting the minority topologies. This suggests that the relatively slowly evolving buried, sheet, and coil sites are giving an accurate picture of the true species tree and the amount of conflict among gene trees. Taken as a whole, this study indicates that phylogenetic analyses using sites in different structural environments can yield different topologies for the deepest branches in the animal tree of life and that analyzing larger numbers of taxa eliminates this conflict. More broadly, our results highlight the desirability of incorporating information about protein structure into phylogenomic analyses.
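To make the idea of counting strongly decisive sites concrete, the following is a minimal sketch and not the authors' pipeline. The abstract defines such sites only as alignment positions that support a specific tree topology. The sketch assumes that support is measured with per-site log-likelihoods under each candidate topology and that a site counts as strongly decisive when its best topology beats the runner-up by at least a threshold. The threshold value of 0.5, the function names, the toy numbers, and the labels T1 and T3 are all hypothetical. The second function is an exact two-sided binomial test, one plausible way to ask whether two minority topologies receive equal numbers of decisive sites.

```python
from math import comb


def count_decisive_sites(site_lnl, threshold=0.5):
    """Count sites whose best topology beats the runner-up by >= threshold.

    site_lnl maps a topology label to a list of per-site log-likelihoods.
    The threshold is an illustrative assumption, not a value from the paper.
    """
    topologies = list(site_lnl)
    n_sites = len(site_lnl[topologies[0]])
    counts = {t: 0 for t in topologies}
    for i in range(n_sites):
        # Rank topologies from highest to lowest log-likelihood at site i.
        ranked = sorted(topologies, key=lambda t: site_lnl[t][i], reverse=True)
        best, runner_up = ranked[0], ranked[1]
        if site_lnl[best][i] - site_lnl[runner_up][i] >= threshold:
            counts[best] += 1
    return counts


def binomial_two_sided_p(k, n):
    """Exact two-sided binomial test of k successes in n trials against p = 0.5."""
    if n == 0:
        return 1.0
    # With p = 0.5 the distribution is symmetric, so double the smaller tail.
    tail = sum(comb(n, j) for j in range(min(k, n - k) + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Toy per-site log-likelihoods for three candidate topologies (made-up numbers).
toy_site_lnl = {
    "T1": [-10.0, -12.0, -9.0, -11.0, -8.0],
    "T2": [-9.0, -12.1, -10.0, -10.0, -8.2],
    "T3": [-10.5, -13.0, -9.1, -11.5, -7.0],
}
counts = count_decisive_sites(toy_site_lnl)

# Compare the two minority topologies, here assumed to be T1 and T3.
minority_p = binomial_two_sided_p(counts["T1"], counts["T1"] + counts["T3"])
```

In this sketch, applying the counting step separately to the sites of each structural environment (buried, exposed, helix, sheet, coil) would give the per-environment proportions that the abstract compares; how sites are assigned to environments is outside what the sketch covers.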

Highlights

  • The relationship between protein structure and the patterns of sequence evolution has been a topic of interest since the very dawn of molecular evolution as a field [1,2] and, despite the very limited amount of data available at the time of those early studies, many of those early hypotheses have stood the test of time

  • Because we used the strongly decisive site criterion, we believed it was important to establish that it yields conclusions similar to those of standard tree searches

  • Tree searches using RAxML resulted in topology T2 with 100% bootstrap support in all cases, indicating that using the proportion of decisive sites yields conclusions that are similar to those of tree searches

Introduction

The relationship between protein structure and the patterns of sequence evolution has been a topic of interest since the very dawn of molecular evolution as a field [1,2] and, despite the very limited amount of data available at the time of those early studies, many of those early hypotheses have stood the test of time (reviewed by Alvarez-Ponce [3]). Our understanding of the factors that determine rates of amino acid change has certainly expanded since those pioneering studies [4,5], but two important constants have been the idea that the buried residues in globular proteins evolve more slowly [6,7,8] and are more hydrophobic than solvent-exposed residues [8,9]. Our ability to understand relationships among organisms was revolutionized by the rapid accumulation of transcriptome and whole genome sequence data.
