Abstract

This article examines whether the empirical tendency known as the Menzerath–Altmann Law (MAL) holds for protein secondary structures. MAL is related to optimization principles observed in natural languages and in genetic information on chromosomes or protein domains. The presence of MAL is examined on a non-redundant dataset of 4728 proteins by verifying significant negative correlations and by fitting classical and newly proposed formulas to the observed trend. We conclude that the lengths of secondary structures depend on their number inside the protein sequence, possibly following the formula proposed in this paper. This behavior is observed on average, but individual proteins can deviate from it, and it is possibly driven by a latent cost function. The data suggest that MAL could provide a useful guiding principle in protein design.

Highlights

  • The Menzerath–Altmann law (MAL) is a specific empirical relation holding between the average lengths of so-called components and their constructs

  • This relation was first observed in natural languages [1,2], where the longer a word is, the shorter its syllables are on average, an inverse relation that can be described by a specific mathematical formula

  • The results show that the average lengths of the α-helix and β-sheet secondary structures, measured in number of amino acids, are related to their count inside a protein, and that the relation can be described by a specific mathematical formula listed as (5)


Summary

Introduction

The Menzerath–Altmann law (MAL) is a specific empirical relation holding between the average lengths of so-called components and their constructs. The purpose of this work is to assess the presence of the MAL in the secondary structures of proteins, i.e., to study whether and how the average lengths of α-helices and β-sheets (measured in number of amino acids) depend on their count inside a protein, and what formula can describe this relation. This has not yet been studied; the findings may inform protein design, protein evolution, protein pathology and/or protein model assessment.
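To make the kind of check described here concrete, the following is a minimal sketch, not the authors' method. It assumes the power-law form y = a·x^b with b < 0, which is commonly used as a simplified MAL formula; formula (5) from the paper is not reproduced. The data are synthetic and purely illustrative, and the helper names `pearson` and `fit_power_law` are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


def fit_power_law(xs, ys):
    """Least-squares fit of y = a * x**b, done as a line on log-log axes."""
    log_x = [math.log(x) for x in xs]
    log_y = [math.log(y) for y in ys]
    n = len(log_x)
    mean_lx = sum(log_x) / n
    mean_ly = sum(log_y) / n
    slope_num = sum((u - mean_lx) * (v - mean_ly) for u, v in zip(log_x, log_y))
    slope_den = sum((u - mean_lx) ** 2 for u in log_x)
    b = slope_num / slope_den
    a = math.exp(mean_ly - b * mean_lx)
    return a, b


# Synthetic example (NOT data from the paper): number of helices per protein
# and a made-up average helix length that shrinks as the count grows.
counts = [1, 2, 3, 4, 6, 8, 12, 16]
mean_lengths = [12.0 * c ** -0.25 for c in counts]

r = pearson(counts, mean_lengths)
a, b = fit_power_law(counts, mean_lengths)
print(f"correlation r = {r:.3f}")
print(f"fitted a = {a:.3f}, b = {b:.3f}")
```

A negative `r` together with a negative fitted exponent `b` is the pattern a MAL-like relation would produce. On real data one would also test the correlation for significance and compare the fit quality of competing formulas, which this sketch omits.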

