Abstract

The correspondence between protein sequences and structures, or sequence-structure map, relates to fundamental aspects of structural, evolutionary and synthetic biology. The specifics of the mapping, such as the fraction of accessible sequences and structures, or the sequences' ability to fold fast, are dictated by the type of interactions between the monomers that compose the sequences. The set of possible interactions between monomers is encapsulated by the potential energy function. In this study, I explore the impact of the relative forces of the potential on the architecture of the sequence-structure map. My observations rely on simple exact models of proteins and random samples of the space of potential energy functions of binary alphabets. I adopt a graph perspective and study the distribution of viable sequences, and the structures they produce, as networks of sequences connected by point mutations. I observe that the relative proportion of attractive, neutral and repulsive forces defines types of potentials that induce sequence-structure maps of vastly different architectures. I characterize the properties underlying these differences and relate them to the structure of the potential. Among these properties are the expected number and relative distribution of sequences associated with specific structures and the diversity of structures as a function of sequence divergence. I study the types of binary potentials observed in natural amino acids and show that there is a strong bias towards only some types of potentials, a bias that seems to characterize the folding code of natural proteins. I discuss implications of these observations for the architecture of the sequence-structure map of natural proteins, the construction of random libraries of peptides, and the early evolution of the natural amino acid alphabet.

Highlights

  • Understanding the properties and organization of the sequence-structure map of proteins has broad implications, ranging from explaining the diversity of known protein folds in the context of cellular physiology and their evolution [1], to synthesizing molecules of biomedical or industrial interest [2], to engineering polymers [3] and proteomes de novo. From an evolutionary standpoint, the relation between sequence and structure is a particular case of a more general problem known as the genotype-phenotype map (GP map) [4].

  • I compare these observations to the types of interactions observed in natural amino acids

  • In order to explore the impact of the potential on the architecture of the sequence-structure map of natural proteins, I concentrate on the L18 model and binary alphabets


Summary

Introduction

Understanding the properties and organization of the sequence-structure map of proteins has broad implications, ranging from explaining the diversity of known protein folds in the context of cellular physiology and their evolution [1], to synthesizing molecules of biomedical or industrial interest [2], to engineering polymers [3] and proteomes de novo. From an evolutionary standpoint, the relation between sequence and structure is a particular case of a more general problem known as the genotype-phenotype map (GP map) [4]. According to the GP map framework, protein sequences correspond to genotypes and structures to phenotypes [5]. A graph-theoretic representation of genotype space provides a quantitative, unifying framework to explore different properties of the sequence-structure relation, while considering these properties from a broader evolutionary perspective. I refer to this detailed characterization of the sequence-structure map as its architecture.
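To make the graph view of genotype space concrete, the following is a minimal sketch and not the procedure used in the study. It assumes a two-letter alphabet written as "H" and "P" (the labels are an assumption; the text only specifies binary alphabets) and a hypothetical, hand-written mapping from viable sequences to structure labels. The function names `point_mutants` and `mutational_networks` are illustrative. The sketch groups sequences that fold into the same structure into networks whose members are connected by single point mutations.

```python
def point_mutants(seq, alphabet="HP"):
    """Yield every sequence that differs from seq at exactly one position.

    The "HP" alphabet is an assumed labeling of a binary alphabet.
    """
    for i, current in enumerate(seq):
        for letter in alphabet:
            if letter != current:
                yield seq[:i] + letter + seq[i + 1:]


def mutational_networks(seq_to_structure):
    """Split viable sequences into connected networks.

    Two sequences belong to the same network when they map to the same
    structure and are linked by a chain of single point mutations that
    passes only through sequences with that structure.
    """
    seen = set()
    networks = []
    for start, structure in seq_to_structure.items():
        if start in seen:
            continue
        seen.add(start)
        component, stack = [], [start]
        while stack:  # depth-first traversal of one network
            seq = stack.pop()
            component.append(seq)
            for neighbor in point_mutants(seq):
                same_fold = seq_to_structure.get(neighbor) == structure
                if same_fold and neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        networks.append((structure, sorted(component)))
    return networks


# Hypothetical toy map: sequences absent from the dict are treated as non-viable.
toy_map = {"HHPP": "A", "HHPH": "A", "PPHH": "A", "HPHP": "B"}
networks = mutational_networks(toy_map)
```

In this toy map, "HHPP" and "HHPH" differ at a single position and share structure "A", so they form one network, while "PPHH" folds into "A" but is not reachable from them by point mutations through viable sequences. In an actual analysis the mapping would come from the folding model and the potential, not from a hand-written dictionary.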

