Large-scale experimental analyses are finding ever more evidence of translation from start codons upstream of the canonical start site. This translation either generates entirely new proteins (from novel upstream open reading frames) or, when the novel start codon is in frame, produces isoforms with extended N-termini. Most extended N-termini are likely to just add a disordered region to the canonical protein isoform, but some may also block recognition of the signal peptide, causing the isoform to accumulate in the incorrect cellular compartment. This analysis finds evidence that upstream translations that would interfere with signal peptides are detected in expected quantities in ribosome profiling experiments, but that the equivalent N-terminally extended protein isoforms are significantly reduced in multiple proteomics experiments. This suggests that these isoforms are likely to be degraded shortly after translation by the ubiquitination pathway, thus preventing the build-up of potentially harmful proteins with hydrophobic regions in the cytoplasm. It is also further evidence that most of the transcripts translated from upstream start sites are the result of an inefficient translation initiation process. This has implications for the annotation of proteins, given the huge numbers of upstream translations that are being detected in large-scale experiments.