Abstract
Enhancing protein thermostability is an important goal of enzyme engineering. Many studies have investigated the properties that determine thermostability, but no consensus has emerged. To understand the mechanisms underlying the high thermostability of thermophilic proteins, we evaluated the relative importance of amino acid frequencies in protein sequences for discriminating thermophilic from non-thermophilic proteins. We used machine learning algorithms together with a three-step feature selection procedure and a principal component (PC) analysis to remove noisy and redundant information. Our results showed that the frequencies of oppositely charged amino acids, i.e., Lys and Glu, were higher in thermophilic proteins, suggesting that electrostatic interactions are fundamentally important for protein stabilization at high temperatures. Further, we found that the frequencies of uncharged polar amino acids, which are thermolabile or actively interact with water molecules, were lower in thermophilic proteins. Moreover, the frequencies of β-branched aliphatic amino acids tended to increase with increasing thermostability. Overall, these results suggest that protein thermostability is determined by a few protein features, which were well captured by the first two PCs. A classifier based on only the first two PCs achieved an accuracy of 90%, suggesting that it could be an effective and efficient tool for engineering stable proteins.
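As an illustration of the input features described above, the following minimal sketch computes amino acid frequencies from a single sequence. It is not the authors' code: the function name `amino_acid_frequencies` and the toy sequence are hypothetical, and how the study handled non-standard residues is not stated here, so this sketch simply ignores them.

```python
from collections import Counter

# One-letter codes of the 20 standard amino acids.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def amino_acid_frequencies(sequence):
    """Return the fraction of each standard amino acid in a protein sequence.

    Characters outside the 20 standard codes are ignored (an assumption
    of this sketch, not a documented choice of the study).
    """
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}

# Hypothetical toy sequence, not taken from the study's data set.
freqs = amino_acid_frequencies("MKKEELLKKAEE")
# Combined frequency of the oppositely charged pair Lys (K) and Glu (E).
charged_pair = freqs["K"] + freqs["E"]
```

Applied to every sequence in a data set, this yields one 20-dimensional frequency vector per protein, which is the kind of feature table a feature selection step or PCA could then operate on.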
Highlights
Proteins are important biocatalysts; most of them are unstable at high temperatures, severely curtailing their applications in the chemical industry [1]
Fourteen amino acids were selected as the relevant amino acids for discriminating thermophilic and non-thermophilic proteins (Fig. 2b)
Further PCA showed that a classifier based on only the first two principal components (PCs) derived from the 14 relevant amino acids achieved 90% accuracy (Table 1), suggesting that the relevant amino acids carry substantial redundant information and that protein thermostability is governed by a small number of amino acid properties
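The idea of classifying on the first two PCs can be sketched as follows. This is a hedged illustration only: the data are synthetic, the column shift standing in for Lys and Glu enrichment is invented, and the nearest-centroid rule is a stand-in because the classifier used in the study is not specified in this summary. The helper names `pca_scores` and `nearest_centroid_predict` are hypothetical.

```python
import numpy as np

def pca_scores(X, n_components=2):
    """Project the rows of X onto the leading principal components via SVD."""
    Xc = X - X.mean(axis=0)                      # centre each feature
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = S**2 / np.sum(S**2)              # variance ratio per PC
    return Xc @ Vt[:n_components].T, explained[:n_components]

def nearest_centroid_predict(scores, labels):
    """Assign each point to the class whose centroid (in PC space) is closest."""
    classes = np.unique(labels)
    centroids = np.array([scores[labels == c].mean(axis=0) for c in classes])
    dists = np.linalg.norm(scores[:, None, :] - centroids[None, :, :], axis=2)
    return classes[np.argmin(dists, axis=1)]

# Synthetic stand-in for a table of 14 amino acid frequencies per protein.
rng = np.random.default_rng(0)
n_per_class, n_features = 40, 14
X = 0.05 + 0.01 * rng.standard_normal((2 * n_per_class, n_features))
y = np.repeat([0, 1], n_per_class)               # 0 = non-thermophilic, 1 = thermophilic
X[y == 1, :2] += 0.10                            # invented enrichment in two columns

scores, explained = pca_scores(X)
# Accuracy on the same data used to compute the centroids (no held-out set).
accuracy = np.mean(nearest_centroid_predict(scores, y) == y)
```

Because the accuracy here is measured on the training data and the classes are separated by construction, it says nothing about the 90% figure reported in Table 1; a real evaluation would use cross-validation or a held-out test set.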
Summary
Proteins are important biocatalysts; most of them are unstable at high temperatures, severely curtailing their applications in the chemical industry [1]. Many efforts have been devoted to enhancing protein thermostability. Thermophilic organisms, such as Thermus aquaticus, produce proteins that can tolerate temperatures of up to 120 °C [2]. Argos et al. [4] and Haney et al. [5] found that replacing certain amino acids, such as Gly and Ser, with Glu can increase a protein's capacity to tolerate high temperatures. These results suggest that the presence of Glu may increase protein thermostability [4], [5]. Perl et al. [7] reported that changing Glu to Arg or Leu could transform a mesophilic protein into a thermophilic protein.