Phylogenetic Analyses of Sites in Different Protein Structural Environments Result in Distinct Placements of the Metazoan Root.

Akanksha Pandey, Edward L Braun

doi:10.3390/biology9040064

Abstract

Phylogenomics, the use of large datasets to examine phylogeny, has revolutionized the study of evolutionary relationships. However, genome-scale data have not been able to resolve all relationships in the tree of life; this could reflect, at least in part, the poor fit of the models used to analyze heterogeneous datasets. Some of the heterogeneity may reflect the different patterns of selection on proteins based on their structures. To test that hypothesis, we developed a pipeline to divide phylogenomic protein datasets into subsets based on secondary structure and relative solvent accessibility. We then tested whether amino acids in different structural environments had distinct signals for the topology of the deepest branches in the metazoan tree. We focused on a dataset that appeared to have a mixture of signals, and we found that the most striking difference in phylogenetic signal reflected relative solvent accessibility. Analyses of exposed sites (residues located on the surface of proteins) yielded a tree that placed ctenophores sister to all other animals, whereas sites buried inside proteins yielded a tree with a sponge+ctenophore clade. These differences in phylogenetic signal were not ameliorated when we conducted analyses using a set of maximum-likelihood profile mixture models. These models are very similar to the Bayesian CAT model, which has been used in many analyses of deep metazoan phylogeny. In contrast, analyses conducted after recoding amino acids to limit the impact of deviations from compositional stationarity increased the congruence in the estimates of phylogeny for exposed and buried sites; after recoding, trees estimated using the exposed and buried sites both supported placement of ctenophores sister to all other animals.
Although the central conclusion of our analyses is that sites in different structural environments yield distinct trees when analyzed using models of protein evolution, our amino acid recoding analyses also have implications for metazoan evolution. Specifically, our results add to the evidence that ctenophores are the sister group of all other animals and they further suggest that the placozoa+cnidaria clade found in some other studies deserves more attention. Taken as a whole, these results provide striking evidence that it is necessary to achieve a better understanding of the constraints due to protein structure to improve phylogenetic estimation.
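The two data manipulations described in the abstract, splitting alignment columns by relative solvent accessibility and recoding amino acids into a reduced alphabet, can be sketched as follows. This is a minimal illustration and not the authors' pipeline: the function names, the 0.25 accessibility cutoff, the toy alignment, and the choice of the Dayhoff six-state grouping are all assumptions made for the example, and the text above does not say which cutoff or recoding scheme was used.

```python
# Dayhoff six-state grouping: a widely used reduced alphabet.
# ASSUMPTION: the abstract does not state which recoding scheme was applied.
DAYHOFF6 = {
    "A": "0", "G": "0", "P": "0", "S": "0", "T": "0",
    "D": "1", "E": "1", "N": "1", "Q": "1",
    "H": "2", "K": "2", "R": "2",
    "I": "3", "L": "3", "M": "3", "V": "3",
    "F": "4", "W": "4", "Y": "4",
    "C": "5",
}


def split_by_exposure(alignment, rsa, threshold=0.25):
    """Split alignment columns into (exposed, buried) sub-alignments.

    alignment: dict mapping taxon name -> aligned sequence (equal lengths).
    rsa: one relative solvent accessibility value (0..1) per column.
    threshold: HYPOTHETICAL cutoff; columns with rsa >= threshold count as exposed.
    """
    lengths = {len(seq) for seq in alignment.values()}
    if lengths != {len(rsa)}:
        raise ValueError("every sequence needs exactly one RSA value per column")

    exposed_cols = [i for i, value in enumerate(rsa) if value >= threshold]
    buried_cols = [i for i, value in enumerate(rsa) if value < threshold]

    def take(cols):
        # Keep only the requested columns, preserving their order.
        return {name: "".join(seq[i] for i in cols) for name, seq in alignment.items()}

    return take(exposed_cols), take(buried_cols)


def recode(alignment, scheme=DAYHOFF6):
    """Replace each amino acid by its group label; gaps and unknowns become '-'."""
    return {
        name: "".join(scheme.get(aa, "-") for aa in seq.upper())
        for name, seq in alignment.items()
    }


# Toy data (invented for illustration only).
toy = {"taxonA": "MKVLD", "taxonB": "MRILE"}
toy_rsa = [0.05, 0.60, 0.10, 0.02, 0.80]

exposed, buried = split_by_exposure(toy, toy_rsa)
recoded_exposed = recode(exposed)
recoded_buried = recode(buried)
```

In the toy data the two taxa differ at three columns, but every difference is a substitution within one Dayhoff group, so the recoded sequences are identical. That is the intended effect of recoding: exchanges among chemically similar residues, which are the ones most prone to compositional bias, no longer count as changes.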

Highlights

The growing availability of very large molecular datasets has transformed the field of phylogenetics. The use of these phylogenomic datasets was suggested to “end incongruence” among phylogenetic estimates by reducing the stochastic error associated with analyses of small datasets [1]
Different structural classes were associated with different phylogenetic signals based on analyses using standard empirical models
The results of analyses using empirical models did not change when using the 20-state general time reversible (GTR) model, which had a better fit to both of the structurally-defined subsets of the filtered Ryan genomic (FRG) data than the LG model despite the large number of free parameters that must be optimized for that model
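To make the "large number of free parameters" concrete, the sketch below counts the free parameters of a 20-state GTR model and shows how an information criterion weighs them against the gain in likelihood. It is an illustration under stated assumptions: the text above does not say which criterion was used to compare GTR with LG, AIC is chosen here only as a familiar example, and the log-likelihood values are placeholders, not results from the paper.

```python
def gtr_free_parameters(n_states=20, estimate_frequencies=True):
    """Count free parameters of a general time reversible substitution model."""
    # n(n-1)/2 symmetric exchangeabilities, one of which is fixed to set the scale.
    exchangeabilities = n_states * (n_states - 1) // 2 - 1
    # Equilibrium frequencies sum to 1, so only n-1 of them are free.
    frequencies = n_states - 1 if estimate_frequencies else 0
    return exchangeabilities + frequencies


def aic(log_likelihood, n_params):
    """Akaike information criterion; lower values indicate better fit."""
    return 2 * n_params - 2 * log_likelihood


# An empirical matrix such as LG has fixed exchangeabilities, so relative to LG
# the 20-state GTR model adds 189 free exchangeability parameters.
extra_params = gtr_free_parameters(estimate_frequencies=False)

# PLACEHOLDER log-likelihoods, invented for illustration only.
lnl_lg = -100000.0
lnl_gtr = -99500.0

# Parameters shared by both models (branch lengths, frequencies, rate
# heterogeneity) cancel in the comparison, so only the extra ones are counted.
gtr_preferred = aic(lnl_gtr, extra_params) < aic(lnl_lg, 0)
```

Under AIC, the GTR model is preferred only if it improves the log-likelihood by more than 189 units, one unit per extra parameter. With datasets of phylogenomic size that threshold is often easy to clear, which is consistent with a parameter-rich model fitting better than LG.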

Summary

Introduction

The growing availability of very large molecular datasets has transformed the field of phylogenetics. The use of these phylogenomic datasets was suggested to “end incongruence” among phylogenetic estimates by reducing the stochastic error associated with analyses of small datasets [1]. However, analyses using genome-scale data have produced multiple distinct resolutions of problematic nodes, sometimes with strong support [6,7,8,9,10,11]. This suggests that analyses of these large datasets can be misled by non-historical signals that may not be as apparent in analyses of smaller datasets. Cases where support is limited despite the use of large amounts of data could reflect one of two phenomena: (1) the data contain a mixture of signals; or (2) the underlying species tree contains a hard polytomy (and historical signal is absent). Understanding the distribution of historical and non-historical signal(s) in large-scale data matrices might provide insights into evolutionary processes and result in a better understanding of analytical methods and their limitations with phylogenomic datasets.

Journal: Biology
Publication Date: Mar 28, 2020
Citations: 26
License type: CC BY 4.0
