Nonstop extension mutations, also known as stop-lost or stop-loss mutations, convert a stop codon into a sense codon, so that translation continues into the 3′ untranslated region until the next in-frame stop codon, thereby extending the C-terminus of the protein. In cancer, only nonstop mutations in SMAD4 have been functionally characterized, while the impact of other nonstop mutations remains unknown. Here, we exploit our pan-cancer NonStopDB dataset and test all 2335 C-terminal extensions arising from somatic nonstop mutations in cancer for their impact on protein expression. In a high-throughput screen, 56.1% of the extensions effectively reduce protein abundance. Extensions of multiple tumor suppressor genes, including PTEN, APC, B2M, CASP8, CDKN1B and MLH1, are effective, and their suppressive impact was validated. Importantly, the effective extensions are more hydrophobic than the neutral extensions, linking C-terminal hydrophobicity with protein destabilization. Analyzing the proteomes of eleven different species reveals conserved patterns of amino acid distribution in the C-terminal regions of proteins compared to the whole proteomes, such as an enrichment of lysine and arginine and a depletion of glycine, leucine, valine and isoleucine across species and kingdoms. These evolutionary selection patterns are disrupted in the cancer-derived effective nonstop extensions.
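As an illustration of the mechanism described above, the following minimal Python sketch derives the C-terminal extension produced when a stop codon mutates to a sense codon, then scores its mean hydropathy. Several things here are assumptions and not taken from the study: the function names `nonstop_extension` and `mean_hydropathy` are hypothetical, the sequences are invented toy examples, and the Kyte-Doolittle scale is only one possible hydrophobicity measure (the abstract does not state which scale was used).

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Kyte-Doolittle hydropathy values (assumed scale, for illustration only).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def nonstop_extension(mutated_stop, utr3):
    """Translate the mutated stop codon plus the 3' UTR up to the next
    in-frame stop codon. Returns the extension peptide, or None when no
    in-frame stop codon exists in the supplied sequence."""
    seq = (mutated_stop + utr3).upper().replace("U", "T")
    peptide = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            return "".join(peptide)
        peptide.append(aa)
    return None


def mean_hydropathy(peptide):
    """Average Kyte-Doolittle score over all residues of a peptide."""
    return sum(KD[aa] for aa in peptide) / len(peptide)


# Toy example: TGA (stop) mutated to TGG (Trp), followed by an invented 3' UTR.
extension = nonstop_extension("TGG", "CTGATTGTCTAAGGC")
print(extension, round(mean_hydropathy(extension), 2))
```

In this toy case the extension is rich in leucine, isoleucine and valine, so its mean hydropathy is strongly positive. A real analysis would take the 3′ UTR sequence from the transcript annotation and would need to handle transcripts that lack an in-frame stop codon before the poly(A) tail.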