Abstract
The various roles that aggregation prone regions (APRs) are capable of playing in proteins are investigated here via comprehensive analyses of multiple non-redundant datasets containing randomly generated amino acid sequences, monomeric proteins, intrinsically disordered proteins (IDPs) and catalytic residues. Results from this study indicate that the aggregation propensities of monomeric protein sequences have been minimized compared to random sequences with uniform and natural amino acid compositions, as observed by a lower average aggregation propensity and fewer APRs that are shorter in length and more often punctuated by gate-keeper residues. However, evidence for evolutionary selective pressure to disrupt these sequence regions among homologous proteins is inconsistent. APRs are less conserved than average sequence identity among closely related homologues (≥80% sequence identity with a parent) but APRs are more conserved than average sequence identity among homologues that have at least 50% sequence identity with a parent. Structural analyses of APRs indicate that APRs are three times more likely to contain ordered versus disordered residues and that APRs frequently contribute more towards stabilizing proteins than equal length segments from the same protein. Catalytic residues and APRs were also found to be in structural contact significantly more often than expected by random chance. Our findings suggest that proteins have evolved by optimizing their risk of aggregation for cellular environments by both minimizing aggregation prone regions and by conserving those that are important for folding and function. In many cases, these sequence optimizations are insufficient to develop recombinant proteins into commercial products. Rational design strategies aimed at improving protein solubility for biotechnological purposes should carefully evaluate the contributions made by candidate APRs, targeted for disruption, towards protein structure and activity.
Highlights
Irreversible β-strand driven protein aggregation and amyloidogenesis is a tremendous burden to biological organisms
Studies of amyloidogenic proteins have revealed that different protein sequences vary in their propensity to aggregate, which can be attributed to the presence of aggregation-nucleating short sequence stretches, capable of forming the cross-β steric zipper motif, called aggregation prone regions (APRs) [6,7,8,9,10]
Mechanistic studies into protein aggregation have revealed that certain sequence regions contribute more to the aggregation propensity of a protein than other sequence regions do
Summary
Irreversible β-strand driven protein aggregation and amyloidogenesis is a tremendous burden to biological organisms. Analyses of APRs indicate common sequence properties including a high preference for β-branched hydrophobic residues, strong β-sheet propensity, low net charge, and in the case of fibril forming patterns, position-specific charged residues [11,12]. Knowledge of these properties has enabled the development of phenomenological and first-principle based methods to predict APRs in any protein sequence [13,14,15,16,17,18,19,20].
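To make the sequence properties listed above concrete, the following is a minimal sketch of a sliding-window scan that flags segments rich in hydrophobic (including β-branched) residues and carrying no net charge. It is a toy heuristic and not one of the prediction methods cited above: the function name `candidate_regions`, the residue groupings, the window length of six and all thresholds are assumptions chosen purely for illustration, and β-sheet propensity and position-specific charge patterns are not modelled at all.

```python
# Toy heuristic only: residue sets, window size and thresholds are
# illustrative assumptions, not parameters of any published APR predictor.
BETA_BRANCHED_HYDROPHOBIC = set("VI")    # Val, Ile
OTHER_HYDROPHOBIC = set("LFWYM")         # assumed additional hydrophobic/aromatic set
CHARGE = {"K": 1, "R": 1, "D": -1, "E": -1}  # His treated as neutral (assumption)


def net_charge(segment):
    """Sum of formal side-chain charges over a sequence segment."""
    return sum(CHARGE.get(res, 0) for res in segment)


def candidate_regions(sequence, window=6, min_hydrophobic=5,
                      min_beta_branched=2, max_abs_charge=0):
    """Return merged (start, end) spans, 0-based and end-exclusive,
    of windows that pass the hydrophobicity and net-charge filters."""
    sequence = sequence.upper()
    spans = []
    for start in range(len(sequence) - window + 1):
        segment = sequence[start:start + window]
        n_bb = sum(res in BETA_BRANCHED_HYDROPHOBIC for res in segment)
        n_hyd = n_bb + sum(res in OTHER_HYDROPHOBIC for res in segment)
        if (n_hyd >= min_hydrophobic
                and n_bb >= min_beta_branched
                and abs(net_charge(segment)) <= max_abs_charge):
            end = start + window
            if spans and start <= spans[-1][1]:
                # Overlapping or touching windows are merged into one span.
                spans[-1] = (spans[-1][0], end)
            else:
                spans.append((start, end))
    return spans


if __name__ == "__main__":
    # Hypothetical sequence: a hydrophobic core flanked by charged residues.
    print(candidate_regions("DKEKVIVIFLDKEK"))
```

In this sketch the charged residues flanking the hydrophobic stretch cause the neighbouring windows to fail the net-charge filter, which loosely mirrors the role the abstract attributes to gate-keeper residues in punctuating APRs.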