Abstract

This study investigates the roles that aggregation-prone regions (APRs) can play in proteins through comprehensive analyses of multiple non-redundant datasets containing randomly generated amino acid sequences, monomeric proteins, intrinsically disordered proteins (IDPs) and catalytic residues. The results indicate that the aggregation propensities of monomeric protein sequences have been minimized relative to random sequences with uniform and natural amino acid compositions: monomeric proteins show a lower average aggregation propensity and fewer APRs, and those APRs are shorter and more often punctuated by gate-keeper residues. However, evidence for evolutionary selective pressure to disrupt these sequence regions among homologous proteins is inconsistent. APRs are less conserved than the average sequence identity among closely related homologues (≥80% sequence identity with a parent), but more conserved than the average sequence identity among homologues that have at least 50% sequence identity with a parent. Structural analyses indicate that APRs are three times more likely to contain ordered than disordered residues, and that APRs frequently contribute more towards stabilizing proteins than equal-length segments from the same protein. Catalytic residues and APRs were also found to be in structural contact significantly more often than expected by chance. Our findings suggest that proteins have evolved to optimize their risk of aggregation in cellular environments both by minimizing aggregation-prone regions and by conserving those that are important for folding and function. In many cases, these sequence optimizations are insufficient to develop recombinant proteins into commercial products. Rational design strategies aimed at improving protein solubility for biotechnological purposes should carefully evaluate the contributions that candidate APRs targeted for disruption make towards protein structure and activity.

Highlights

  • Irreversible β-strand driven protein aggregation and amyloidogenesis is a tremendous burden to biological organisms

  • Studies of amyloidogenic proteins have revealed that different protein sequences vary in their propensity to aggregate, which can be attributed to the presence of aggregation-nucleating short sequence stretches, capable of forming the cross-β steric zipper motif, called aggregation prone regions (APRs) [6,7,8,9,10]

  • Mechanistic studies into protein aggregation have revealed that certain sequence regions contribute more to the aggregation propensity of a protein than other sequence regions do


Summary

Introduction

Irreversible β-strand driven protein aggregation and amyloidogenesis is a tremendous burden to biological organisms. Analyses of APRs indicate common sequence properties including a high preference for β-branched hydrophobic residues, strong β-sheet propensity, low net charge, and in the case of fibril forming patterns, position-specific charged residues [11,12]. Knowledge of these properties has enabled the development of phenomenological and first-principle based methods to predict APRs in any protein sequence [13,14,15,16,17,18,19,20].
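
As a rough illustration of two of the sequence properties listed above, the Python sketch below scans a sequence with a sliding window and reports windows that are rich in β-branched hydrophobic residues and carry no net charge. It is a toy filter, not one of the cited prediction methods: the function names, the six-residue window, and the thresholds are assumptions chosen for illustration, and β-sheet propensity and position-specific charge patterns are not modelled.

```python
# Toy illustration only: NOT one of the published APR predictors.
# Window size and thresholds below are hypothetical choices.

BETA_BRANCHED_HYDROPHOBIC = set("VI")  # valine, isoleucine
POSITIVE = set("KR")                   # lysine, arginine
NEGATIVE = set("DE")                   # aspartate, glutamate


def window_properties(seq, size=6):
    """Yield (start, segment, beta_branched_fraction, net_charge) per window."""
    seq = seq.upper()
    for start in range(len(seq) - size + 1):
        segment = seq[start:start + size]
        branched = sum(res in BETA_BRANCHED_HYDROPHOBIC for res in segment) / size
        # Simple integer charge: +1 for K/R, -1 for D/E; histidine is ignored.
        charge = (sum(res in POSITIVE for res in segment)
                  - sum(res in NEGATIVE for res in segment))
        yield start, segment, branched, charge


def candidate_windows(seq, size=6, min_branched=0.5, max_abs_charge=0):
    """Return (start, segment) for windows passing both toy criteria."""
    return [
        (start, segment)
        for start, segment, branched, charge in window_properties(seq, size)
        if branched >= min_branched and abs(charge) <= max_abs_charge
    ]


if __name__ == "__main__":
    # Made-up sequence: a V/I-rich core flanked by charged residues.
    print(candidate_windows("KKVIVIVIDD"))
```

In the made-up example sequence, only the uncharged V/I core passes both criteria; windows that overlap the flanking lysines or aspartates are rejected on net charge.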
