Abstract
Improved computational modeling of protein translation rates, including better prediction of where translational slowdowns occur along an mRNA sequence, is critical for understanding co-translational folding. Because codons within a synonymous codon group are translated at different rates, many computational translation models rely on analyzing synonymous codons. Some models rely on genome-wide codon usage bias (CUB), on the assumption that globally rare and common codons are the most informative of slow and fast translation, respectively. Others use the CUB observed only in highly expressed genes, which should be under selective pressure to be translated efficiently (and whose CUB may therefore be more indicative of translation rates). No prior work has analyzed these models for their ability to predict translational slowdowns. Here, we evaluate five models for their association with slowly translated positions as denoted by two independent ribosome footprint (RFP) count experiments from S. cerevisiae, because RFP data are often considered a "ground truth" for translation rates across mRNA sequences. We show that all five models associate strongly with the RFP data and therefore have potential for estimating translational slowdowns. However, we also show that RFP counts for the same genes from independent experiments correlate only weakly, even when the experimental conditions are similar. This raises concerns about the efficacy of using current RFP experimental data for estimating translation rates and highlights a potential advantage of using computational models to understand translation rates instead.
Highlights
In the section Specific codons appear to be “slow”, we examine how each form of codon usage bias (CUB measure) relates to ribosome footprint (RFP)-implied slow codons from 14 GWIPS-vis data sets that use cycloheximide (CHX) to freeze the ribosomes.
When determining the codon window size to use with our computational codon usage models, all instances of the classifier find that window sizes between eight and 10 codons (the windows (-4, +3), (-5, +3), and (-5, +4)) are the most predictive of RFP counts.
Because the computational models tested in this study rely on sliding sequence windows, we use a proof-of-concept classifier to determine which window is most predictive of RFP count data and is therefore a better proxy for translation tempo.
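To make the window notation concrete, the following is a minimal Python sketch of a codon sliding window, not the method used in the study. It assumes that a window (-u, +d) covers u codons upstream of the focal codon, the focal codon itself, and d codons downstream, which matches the stated sizes: (-4, +3) spans eight codons and (-5, +4) spans 10. The function names, the per-codon score table, and the choice to average scores over the window are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Optional


def split_codons(cds: str) -> List[str]:
    """Split a coding sequence into codons, dropping any trailing partial codon."""
    usable = len(cds) - len(cds) % 3
    return [cds[i:i + 3] for i in range(0, usable, 3)]


def window_size(upstream: int, downstream: int) -> int:
    """Number of codons in the window (-upstream, +downstream), focal codon included."""
    return upstream + 1 + downstream


def window_scores(
    codons: List[str],
    codon_score: Dict[str, float],
    upstream: int = 5,
    downstream: int = 4,
) -> List[Optional[float]]:
    """Mean per-codon score over the window centred on each position.

    Averaging is an assumption made for this sketch. Positions whose
    window would run past either end of the sequence get None.
    """
    scores: List[Optional[float]] = []
    for i in range(len(codons)):
        start, stop = i - upstream, i + downstream + 1
        if start < 0 or stop > len(codons):
            scores.append(None)
            continue
        window = codons[start:stop]
        scores.append(sum(codon_score[c] for c in window) / len(window))
    return scores


# Toy usage with made-up scores (hypothetical values, not real CUB measures).
toy_score = {"ATG": 1.0, "GCT": 0.5, "AAA": 0.2, "GGG": 0.8}
toy_codons = split_codons("ATGGCTGCTAAAGGG")
toy_result = window_scores(toy_codons, toy_score, upstream=1, downstream=1)
```

In this sketch, low window scores would mark candidate slow regions under whichever CUB measure supplies `codon_score`. Comparing those positions with RFP counts is a separate step that is not shown here.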
Summary
A better understanding of the dynamics of protein translation (i.e., translation rates of ribosomes at specific codon positions along mRNA sequences) has many biological applications, such as enabling better understanding of co-translational protein folding and aiding in gene.
Codon usage models’ association with translationally slow codons.
1R01GM120733 awarded to SE, TM, PC, and JL. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.