Abstract

Proteins with low-complexity domains continue to emerge as key players in both normal and pathological cellular processes. Although low-complexity domains are often grouped into a single class, individual low-complexity domains can differ substantially with respect to amino acid composition. These differences may strongly influence the physical properties, cellular regulation, and molecular functions of low-complexity domains. Therefore, we developed a bioinformatic approach to explore relationships between amino acid composition, protein metabolism, and protein function. We find that local compositional enrichment within protein sequences is associated with differences in translation efficiency, abundance, half-life, protein-protein interaction promiscuity, subcellular localization, and molecular functions of proteins on a proteome-wide scale. However, local enrichment of related amino acids is sometimes associated with opposite effects on protein regulation and function, highlighting the importance of distinguishing between different types of low-complexity domains. Furthermore, many of these effects are discernible at amino acid compositions below those required for classification as low-complexity or statistically biased by traditional methods and in the absence of homopolymeric amino acid repeats, indicating that thresholds employed by classical methods may not reflect biologically relevant criteria. Application of our analyses to composition-driven processes, such as the formation of membraneless organelles, reveals distinct composition profiles even for closely related organelles. Collectively, these results provide a unique perspective and detailed insights into relationships between amino acid composition, protein metabolism, and protein functions.

Highlights

  • Low-complexity domains in protein sequences are regions that are composed of only a few amino acids in the protein “alphabet”

  • We find that high local composition of individual amino acids is associated with pervasive effects on protein metabolism, subcellular localization, and molecular function on a proteome-wide scale

  • Our results provide a coherent view and unprecedented resolution of the effects of local amino acid enrichment on protein biology


Introduction

Low-complexity domains (LCDs) in proteins are regions enriched in only a subset of possible amino acids. LCD boundaries are later extended and refined by merging overlapping LCDs and calculating combinatorial sequence probabilities. Another metric commonly used to assess relative sequence complexity is compositional bias, which involves determining the statistical probability of a sequence given whole-proteome frequencies of the individual amino acids [11,12]. These approaches (or closely related approaches) have been used extensively to examine LCDs on a proteome-wide scale [1,3,12,13,14,15,16,17].
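To make the idea of compositional bias concrete, the following is a minimal sketch in Python rather than the method used in this work. It assumes a simple binomial model: the probability of seeing at least k copies of one amino acid in a fixed-length window, given that amino acid's whole-proteome frequency. The function names (`background_frequencies`, `binomial_tail`, `scan_enrichment`), the window size, and the toy sequences are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from math import comb


def background_frequencies(proteome):
    """Fraction of each amino acid across all sequences in the proteome."""
    counts = Counter()
    for seq in proteome:
        counts.update(seq)
    total = sum(counts.values())
    return {aa: n / total for aa, n in counts.items()}


def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def scan_enrichment(seq, aa, background, window=20):
    """Yield (start, fraction, p_value) for every full-length window.

    fraction is the local composition of `aa` in the window; p_value is the
    binomial probability of a count at least that high under the background
    frequency (an assumed model, used here for illustration only).
    """
    p = background.get(aa, 0.0)
    for start in range(len(seq) - window + 1):
        k = seq[start:start + window].count(aa)
        yield start, k / window, binomial_tail(k, window, p)


if __name__ == "__main__":
    # Toy sequences, not real proteins.
    proteome = ["MKQQQQQQQQAL", "MASTLKVDEQNR", "MGSSPLKAVEDT"]
    bg = background_frequencies(proteome)
    for start, frac, pval in scan_enrichment(proteome[0], "Q", bg, window=8):
        print(start, round(frac, 2), f"{pval:.2e}")
```

Reporting the local fraction alongside the probability keeps the two notions separate: a window can be strongly enriched in one amino acid by composition while still falling short of a chosen statistical threshold, which is the kind of sub-threshold case the abstract refers to.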
