Abstract

Amino acids show apparent propensities toward their neighbors. Beyond these neighborhood preferences, amino acid substitutions are also thought to be context dependent, but the patterns of this context dependence remain poorly understood. Using relative entropy, we investigated the neighbor preferences of the 20 amino acids and the context-dependent effects of amino acid substitutions in protein sequences from human, mouse, and dog. For most of the 20 amino acids, the highest relative entropy was observed at the nearest adjacent site on either the N- or C-terminal side; the exceptions were C and G. C showed its highest relative entropy at the third flanking site, and a periodic pattern was detected at the sites flanking G. Furthermore, neighbor preference patterns varied greatly among secondary structures. We then comprehensively investigated the context-dependent effects of amino acid substitutions. Nearly half of the 380 substitution types were evidently context dependent, and the context-dependent patterns depended on protein secondary structure. Among the 20 amino acids, P had the greatest effect on amino acid substitutions. The mechanisms underlying these context-dependent effects are possibly mutation bias at the DNA level and natural selection. Our findings may improve secondary structure prediction algorithms and protein design; moreover, this study provides useful information for developing empirical models of protein evolution that account for dependence between residues.
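As an illustration of the relative-entropy measure mentioned above, the following is a minimal sketch rather than the authors' implementation. It assumes the standard Kullback-Leibler form, with the residue distribution observed at a fixed offset from a chosen central amino acid compared against the background residue frequencies of the same sequence set, in bits. The function names, the choice of background, and the toy sequences are all hypothetical.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def background_frequencies(sequences):
    """Overall frequency of each standard amino acid across all sequences."""
    counts = Counter()
    for seq in sequences:
        counts.update(ch for ch in seq if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no standard amino acids found")
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


def neighbor_relative_entropy(sequences, center, offset):
    """Relative entropy (bits) of residues found `offset` positions from `center`.

    Negative offsets look toward the N-terminus, positive ones toward the
    C-terminus. Assumption: the background is the pooled residue frequency
    of the input sequences; the source does not specify its choice.
    """
    background = background_frequencies(sequences)
    counts = Counter()
    for seq in sequences:
        for i, ch in enumerate(seq):
            j = i + offset
            if ch == center and 0 <= j < len(seq) and seq[j] in AMINO_ACIDS:
                counts[seq[j]] += 1
    total = sum(counts.values())
    if total == 0:
        raise ValueError("central residue has no neighbors at this offset")
    entropy = 0.0
    for aa, n in counts.items():
        p = n / total
        # Every observed neighbor also occurs in the background, so q > 0.
        q = background[aa]
        entropy += p * math.log2(p / q)
    return entropy


# Toy input, not real protein data: scan offsets -3..+3 around C.
toy_sequences = ["CGCGCG", "AAAA"]
for k in (-3, -2, -1, 1, 2, 3):
    print(k, neighbor_relative_entropy(toy_sequences, "C", k))
```

Scanning the offset in this way gives one value per flanking site, so the site with the largest value can be compared across central residues. Real data would likely also need handling for low counts, which this sketch omits.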

Highlights

  • Amino acid sequences are necessary to allow proteins to fold into their native conformations [1]. As such, protein sequence patterns should be characterized to understand protein structure, function, and stability

  • We aimed to investigate the neighbor preference patterns of 20 amino acids and the context-dependent effects of amino acid substitutions

  • Our results showed that nearly half of the 380 amino acid substitution types were remarkably context dependent, and the highest relative entropies were mainly observed at the two nearest flanking sites (Figure 5A; Figure S5)


Summary

Introduction

Amino acid sequences are necessary to allow proteins to fold into their native conformations [1]. As such, protein sequence patterns should be characterized to understand protein structure, function, and stability. In addition to amino acid preferences for different secondary structures, preferences for particular residue pairs in protein sequences have been discovered. These preferred residue pairings are found in α-helices [6,7,8,9,10,11], parallel/antiparallel β-sheets [12,13,14], loops [15], and protein inter-domain linkers [16]. Such residue pairs are related to secondary structure formation and protein stabilization. This research may provide new insights into the neighbor preferences of amino acids and may improve secondary structure prediction algorithms and protein design.

