Gene transcription and protein translation are two key steps of the 'central dogma.' Quantitatively deconvoluting the factors that contribute to the coding ability of transcripts in mammals remains a major challenge. Here, we propose ribosome calculator (RiboCalc) for quantitatively modeling the coding ability of RNAs in the human genome. In addition to accurately predicting experimentally confirmed coding abundance from sequence and transcription features, RiboCalc provides interpretable parameters with biological information. Large-scale analysis further revealed a number of transcripts whose coding ability varies across distinct cell types (i.e., context-dependent coding transcripts), suggesting that, contrary to conventional wisdom, a transcript's coding ability should be modeled as a continuous spectrum with a context-dependent nature.