Abstract
The degree to which codon usage can be explained by tRNA abundance in bacterial species is often limited, partly because differential tRNA abundance is usually approximated by tRNA gene copy numbers. To better understand the coevolution between tRNA abundance and codon usage, we provide a better estimate of tRNA abundance by profiling tRNA mapped reads (tRNA tpm) using publicly available RNA Sequencing data. To demonstrate the feasibility of our approach, we show that tRNA tpm is consistent with tRNA abundances derived from RNA fingerprinting experiments in Escherichia coli, Bacillus subtilis, and Salmonella enterica. Furthermore, we do not observe an appreciable reduction in tRNA sequencing efficiency due to post-transcriptional methylations in the seven bacteria studied. To determine optimal codons, we calculate codon usage in highly and lowly expressed genes determined by protein per transcript. We found that tRNA tpm is sensitive enough to identify more translationally optimal codons than gene copy number and early tRNA fingerprinting abundances. Additionally, tRNA tpm improves the predictive power of the tRNA adaptation index over codon preference. Our results suggest that dependence of codon usage on tRNA availability is not always associated with species growth rate. Conversely, tRNA availability is better optimized to codon usage in fast-growing than slow-growing species.
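As a rough illustration of what a "tpm" value means here, the sketch below applies the standard transcripts-per-million normalization to a set of tRNA read counts. It assumes the usual TPM definition (length-normalized counts rescaled to sum to one million); the abstract does not spell out the exact normalization used, and the function name, tRNA labels, and counts are hypothetical.

```python
def tpm(read_counts, lengths):
    """Standard TPM: divide each count by its gene length, then rescale
    so that all values sum to one million.

    read_counts and lengths are dicts keyed by the same (hypothetical)
    tRNA gene names. This is the textbook definition, assumed here; it is
    not taken from the paper's pipeline.
    """
    rates = {name: read_counts[name] / lengths[name] for name in read_counts}
    total = sum(rates.values())
    if total == 0:
        # No mapped reads at all: report zeros rather than dividing by zero.
        return {name: 0.0 for name in rates}
    return {name: rate / total * 1e6 for name, rate in rates.items()}


# Made-up example values, for illustration only.
example_counts = {"tRNA-Ala-GGC": 300, "tRNA-Leu-CAG": 850, "tRNA-Arg-CCT": 50}
example_lengths = {"tRNA-Ala-GGC": 76, "tRNA-Leu-CAG": 87, "tRNA-Arg-CCT": 77}
example_tpm = tpm(example_counts, example_lengths)
```

Because tRNA genes are all of similar length, the length term changes the ranking only slightly; most of the signal comes from the mapped read counts themselves.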
Highlights
Codon optimization is critical to researchers seeking to improve protein production
To demonstrate the fidelity of tRNA tpm in bacteria, we compared these values with RNA fingerprinting abundances (Fig. 1, Supplementary File S2) previously reported in E. coli[10,42], S. enterica[10], and B. subtilis[11]
One may reason that, if a bacteriophage encodes many tRNA genes in its own genome, especially when these tRNAs are rare in the host, the phage codon usage will be less dependent on the host tRNA pool
Summary
Codon optimization is critical to researchers seeking to improve protein production. Early experimental studies have shown that replacing rare codons with optimal ones increases protein yields in Escherichia coli[1,2]. A number of codon usage indices use tRNA gene copy number as a proxy for tRNA abundance to identify translationally optimal codons. We have recently developed a new tool for processing RNA-Seq data, ARSDA[27], that stores identical reads as single entries to drastically reduce data storage and computation time for analyzing large RNA-Seq datasets relative to previous methods[28,29]. These species were selected because their protein abundance data are available in PaxDb[30], their growth rates are described on the basis of generation time (bacteria with generation times >2.5 hours are considered slow growing and all those with shorter generation times are considered fast growing)[24], and their RNA-Seq data are available (GEO Datasets).
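Two ideas in the summary are easy to sketch: storing identical reads as single entries with a count, and splitting species into slow and fast growers at a 2.5 hour generation time. The code below is a minimal illustration only. It is not ARSDA's implementation or file format, and the function names and example values are hypothetical. Treating a generation time of exactly 2.5 hours as fast growing is an assumption, since the text only says ">2.5 hour" is slow.

```python
from collections import Counter


def collapse_reads(reads):
    """Store each distinct read sequence once, with the number of times it
    occurred. Illustrates the general idea of collapsing identical reads;
    not ARSDA's actual data structure."""
    return Counter(reads)


def growth_class(generation_time_hours, threshold_hours=2.5):
    """Classify a species by generation time: strictly above the threshold
    is 'slow', everything else 'fast' (boundary handling is an assumption)."""
    return "slow" if generation_time_hours > threshold_hours else "fast"


# Made-up reads: five reads collapse to three distinct entries.
reads = ["GGGGCTATAG", "GGGGCTATAG", "GCCCGGATAG", "GGGGCTATAG", "GCGGGAATAG"]
collapsed = collapse_reads(reads)
```

Downstream steps such as mapping can then process each distinct sequence once and multiply by its count, which is where the savings in storage and computation come from when many reads are identical.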