Abstract

The engineering of thermostable enzymes is receiving increased attention. The paper, detergent, and biofuel industries, in particular, seek to replace toxic chlorine chemicals with environmentally friendly enzymes. Enzymes typically function at temperatures below 60°C and denature when exposed to higher temperatures. A small proportion of enzymes, however, can withstand higher temperatures as a result of various structural adaptations. Understanding the protein attributes involved in this adaptation is the first step toward engineering thermostable enzymes. We employed various supervised and unsupervised machine learning algorithms, together with attribute weighting approaches, to find amino acid composition attributes that contribute to enzyme thermostability. Specifically, we compared two groups of enzymes: mesostable and thermostable. Attribute weighting was also combined with supervised and unsupervised clustering algorithms to predict and model protein thermostability from amino acid composition properties. Mining a large number of protein sequences (2090) with a variety of machine learning algorithms, based on the analysis of more than 800 amino acid attributes, increased the accuracy of this study. Moreover, these models successfully predicted thermostability from the primary structure of proteins. The results showed that expectation maximization clustering, in combination with the uncertainty and correlation attribute weighting algorithms, can classify thermostable and mesostable proteins with 100% accuracy. Seventy per cent of the weighting methods selected Gln content and the frequency of hydrophilic residues as the most important protein attributes. At the dipeptide level, the frequency of Asn-Glu was the key factor in distinguishing mesostable from thermostable enzymes.
This study demonstrates the feasibility of predicting thermostability irrespective of sequence similarity and will serve as a basis for engineering thermostable enzymes in the laboratory.
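
To make the kind of sequence-derived attribute discussed above concrete, the following is a minimal Python sketch that computes three of the attributes named in the abstract (Gln content, frequency of hydrophilic residues, and Asn-Glu dipeptide frequency) from a one-letter amino acid sequence. It is an illustration only, not the study's feature-extraction pipeline: the function name `composition_features`, the output keys, and in particular the set of residues counted as hydrophilic are assumptions made for this example.

```python
from collections import Counter

# Assumption: one common grouping of hydrophilic (polar or charged) residues.
# The study's own definition of "hydrophilic residues" may differ.
HYDROPHILIC = set("RNDQEHKST")


def composition_features(sequence):
    """Return a few composition attributes for a one-letter protein sequence.

    Hypothetical helper for illustration; names and keys are not from the paper.
    """
    seq = sequence.upper()
    n = len(seq)
    if n < 2:
        raise ValueError("sequence must contain at least two residues")

    # Count single residues and overlapping dipeptides (adjacent residue pairs).
    residue_counts = Counter(seq)
    dipeptide_counts = Counter(seq[i:i + 2] for i in range(n - 1))

    return {
        # Fraction of residues that are glutamine (Gln, Q).
        "gln_fraction": residue_counts["Q"] / n,
        # Fraction of residues in the assumed hydrophilic set.
        "hydrophilic_fraction": sum(residue_counts[a] for a in HYDROPHILIC) / n,
        # Fraction of overlapping dipeptides that are Asn-Glu (NE).
        "asn_glu_dipeptide_fraction": dipeptide_counts["NE"] / (n - 1),
    }
```

Applied to every sequence in a dataset, a function of this kind yields one feature vector per protein; attribute weighting and clustering would then operate on those vectors.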

Highlights

  • The primary structure of a protein is the most important factor in determining enzyme thermostability

  • The results showed that amino acid composition can be used to efficiently discriminate between different halostable protein groups with up to 98% accuracy

  • We examined a variety of attribute weighting algorithms and various supervised and unsupervised clustering models on a large number (800) of amino acid properties

Summary

Introduction

The primary structure of a protein is the most important factor in determining enzyme thermostability. This stability can be improved by adjusting external environmental factors including cations, substrates, co-enzymes, and modulators. Considerable attention has been paid to thermostable enzymes. Many industrial applications have been reported for thermostable enzymes because they are more stable and generally better suited to harsh processing conditions [1,2]. Enzymes present in thermophiles are more stable than those found in their mesophilic counterparts. Further research will allow additional exploitation of thermophiles for biotechnology applications. The cloning of enzymes from thermophiles into mesophilic hosts is especially promising. Most currently available thermostable enzymes have been derived from mesophiles
