Abstract

Which variables determine the constraints on gene sequence evolution is one of the most central questions in molecular evolution. In the fission yeast Schizosaccharomyces pombe, an important model organism, the variables influencing the rate of sequence evolution have yet to be determined. Previous studies in other single-celled organisms have generally found gene expression levels to be the most significant variable, with numerous other variables, such as gene length and functional importance, identified as having a smaller impact. Using publicly available data, we applied partial least squares regression, principal components regression, and partial correlations to determine the variables most strongly associated with sequence evolution constraints. We identify centrality in the protein–protein interaction network, amino acid composition, and cellular location as the most important determinants of sequence conservation. However, each factor explains only a small amount of variance, and there are numerous variables having a significant or heterogeneous influence. Our models explain more than half of the variance in dN, raising the possibility that future refined models could quantify the role of stochastics in evolutionary rate variation.
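The partial correlations mentioned in the abstract can be illustrated with a minimal sketch. It uses the standard first-order partial correlation formula. The variable names and values are invented for illustration and are not taken from the study's data. The point is only that two variables can look correlated because they share a driver, and that the association can disappear once that driver is controlled for.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def partial_corr(x, y, z):
    """Correlation of x and y after controlling for a single variable z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Hypothetical toy values: var_a and var_b both track shared_driver,
# but their remaining variation is unrelated by construction.
shared_driver = [1, 2, 3, 4, 5, 6]
var_a = [2, 0, 4, 5, 3, 7]
var_b = [2, 2, 2, 3, 5, 7]

raw = pearson(var_a, var_b)                         # clearly positive
partial = partial_corr(var_a, var_b, shared_driver) # approximately zero

print(f"raw correlation:     {raw:.3f}")
print(f"partial correlation: {partial:.3f}")
```

With many candidate predictors, the same idea generalises to controlling for several variables at once, which this sketch does not cover.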

Highlights

  • The question of which variables determine the rate of sequence evolution is one of the most central in evolutionary genomics

  • While there is a long list of variables that are believed to influence the rate of sequence evolution, the importance of each has still not been explored in fission yeasts as far as we are aware

  • We propose the additional use of the similar partial least squares regression (PLS), which, unlike principal components regression (PCR), reduces the dimensionality of the independent variables with respect to both the independent and dependent variables rather than the independent variables alone (Haenlein and Kaplan 2004)
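The difference between PCR and PLS described in the last highlight can be sketched on a toy case with two predictors. This is an illustration under stated assumptions, not the authors' analysis: the data are invented, only the first component of each method is computed, and the predictors are deliberately left unstandardised so that the effect is easy to see. PCR picks the direction of greatest predictor variance without looking at the response, whereas the first PLS weight vector is proportional to the covariance between each predictor and the response.

```python
import math

def mean(v):
    return sum(v) / len(v)

def cov(u, v):
    """Population covariance of two equal-length sequences."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / len(u)

def r_squared(t, y):
    """Variance in y explained by a single score t (squared correlation)."""
    return cov(t, y) ** 2 / (cov(t, t) * cov(y, y))

# Hypothetical toy data: x1 has high variance but is unrelated to y;
# x2 has low variance but fully determines y.
x1 = [-4, 4, -4, 4, -4, 4, -4, 4]
x2 = [-1, -1, 1, 1, -1, -1, 1, 1]
y = [2 * v for v in x2]

# PCR, first component: leading eigenvector of the 2x2 predictor
# covariance matrix, obtained from the closed-form principal-axis angle.
a, b, c = cov(x1, x1), cov(x1, x2), cov(x2, x2)
theta = 0.5 * math.atan2(2 * b, a - c)
pc1 = (math.cos(theta), math.sin(theta))
pcr_score = [pc1[0] * u + pc1[1] * v for u, v in zip(x1, x2)]

# PLS, first component: weights proportional to cov(predictor, response).
w = (cov(x1, y), cov(x2, y))
norm = math.hypot(w[0], w[1])
pls_w = (w[0] / norm, w[1] / norm)
pls_score = [pls_w[0] * u + pls_w[1] * v for u, v in zip(x1, x2)]

pcr_r2 = r_squared(pcr_score, y)
pls_r2 = r_squared(pls_score, y)
print(f"PCR first component R^2: {pcr_r2:.3f}")
print(f"PLS first component R^2: {pls_r2:.3f}")
```

In this constructed case the first principal component follows x1 and explains none of the variance in y, while the first PLS component follows x2 and explains all of it. Real data sit between these extremes, and full PLS implementations extract further components after deflation.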

Introduction

The question of which variables determine the rate of sequence evolution is one of the most central in evolutionary genomics. While there is a long list of variables that are believed to influence the rate of sequence evolution, the importance of each has, as far as we are aware, not yet been explored in fission yeasts. This has been examined in a range of other organisms, generally showing that gene expression levels most strongly influence sequence constraint, at least in single-celled organisms (Zhang and Yang 2015). Kimura and Ohta (1974) suggested, based on the neutral theory of molecular evolution (Kimura 1968), that functional importance (the importance of a gene for organismal fitness) would be the most important predictor of sequence evolution constraint. Dispensability and essentiality both refer to the effect of gene loss rather than point mutations, which is how sequence change is measured (Pal et al. 2006; Alvarez-Ponce 2014).
