Protein hormones usually act via cell surface receptors linked to intracellular transduction pathways, yet they seem too complex and energetically expensive to be economical as single-message molecules. Proteolytic fragments of some protein hormones are known from earlier studies to have additional functions. In addition, protein hormone translation sequences often contain multiple secreted products, e.g., poorly characterized pro-proteins or contraposed antagonistic hormones. If telescoped secondary functions are released during proteolytic processing at the synthetic cell, in circulation, or at target cells, organisms gain the efficiencies needed to use proteins as common signals. We have compiled a catalog of 2011 known soluble human protein hormone transcripts or transcript products and have now mapped: their predicted cleavages (PROSPER, Monash University) by 24 known proteases from 4 protease families (class: name and MEROPS number - Aspartic: HIV-1 retropepsin A02.001; Cysteine: cathepsin K C01.036, calpain 1 C02.001, caspase 1 C14.001, caspase 3 C14.003, caspase 7 C14.004, caspase 6 C14.005, caspase 8 C14.009; Metalloprotease: MMP 2 M10.003, MMP 9 M10.004, MMP 3 M10.005, MMP 7 M10.008; Serine: chymotrypsin A (bovine) S01.001, granzyme B (human) S01.010, elastase 2 S01.131, cathepsin G S01.133, granzyme B (mouse) S01.136, thrombin S01.217, plasmin S01.233, glutamyl peptidase I S01.269, furin S08.071, signal peptidase I S26.001, thylakoidal processing peptidase S26.008, signalase S26.010); the predicted secondary structures of the 887 unique transcripts; the known locations of the exon boundaries for the 459 canonical (HAVANA annotation) transcripts; and the multiple alignment of the canonical transcripts.
After predicted cleavage by all 24 proteases, 100% of the canonical transcripts retain 8 +/- 7 (range, 1 – 56) residual peptides of >10 amino acids in length (M +/- SD, 18 +/- 8 residues; range, 10 – 67); only 1.19% of total fragments include single amino acid repeats of >4 residues. Although the cleavage prediction algorithm considers secondary structure, solvent accessibility, and surface charge as well as primary amino acid sequence, cleavage patterns are retained across multiple transcript isoforms and known bioactive transcript fragments, down to peptides of 10-20 residues. Co-alignments of the proteolytic map, the secondary structure map, and the exon boundary map demonstrate a high propensity for overlap of these features, including a 4.4-fold higher predicted proteolytic cleavage rate (% of possible residues; p <0.01) within 3 residues of the exon boundaries than at sites >3 residues from the boundaries. The results suggest evolutionary retention of cleavage patterns that allow organismal access to secondary structures or functions, including nested secondary hormonal signals, encoded by single exons, obviating the need to preserve archaic individual exon genes.
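As an illustration only, the two summary statistics above (residual peptides of >10 residues, and the cleavage rate near versus far from exon boundaries) could be computed along the following lines. This is a minimal sketch under stated assumptions, not the authors' pipeline and not the PROSPER interface: the function names, the convention that a cleavage site is the 1-based position after which the chain is cut, and the toy inputs are all hypothetical.

```python
def residual_peptides(sequence, cut_after, min_len=11):
    """Cut `sequence` after each 1-based position in `cut_after` and
    return the fragments with at least `min_len` residues
    (min_len=11 corresponds to ">10 amino acids")."""
    n = len(sequence)
    # Keep only cut positions that fall strictly inside the chain.
    inner = sorted({p for p in cut_after if 0 < p < n})
    bounds = [0] + inner + [n]
    fragments = [sequence[a:b] for a, b in zip(bounds, bounds[1:])]
    return [f for f in fragments if len(f) >= min_len]


def boundary_enrichment(length, cut_after, exon_boundaries, window=3):
    """Fold difference in cleavage rate (cleaved positions / possible
    positions) within `window` residues of any exon boundary versus
    all remaining positions. Assumption: boundaries are given as
    1-based residue positions on the protein."""
    all_positions = set(range(1, length + 1))
    near = {p for p in all_positions
            if any(abs(p - b) <= window for b in exon_boundaries)}
    far = all_positions - near
    cuts = set(cut_after) & all_positions
    if not near or not far:
        raise ValueError("need positions both near and far from boundaries")
    near_rate = len(cuts & near) / len(near)
    far_rate = len(cuts & far) / len(far)
    if far_rate == 0:
        raise ValueError("no cleavage sites far from boundaries")
    return near_rate / far_rate


if __name__ == "__main__":
    # Toy data, not taken from the study.
    toy_sequence = "A" * 30
    print(residual_peptides(toy_sequence, cut_after=[5, 20]))
    print(boundary_enrichment(100, cut_after=[10, 48, 52, 80],
                              exon_boundaries=[50]))
```

Treating every residue as a possible cleavage position mirrors the "% of possible residues" normalization in the text; a per-protease analysis would restrict the denominator to residues each protease could plausibly cut, which this sketch does not attempt.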