Abstract

Thermophilic proteins are valuable in biotechnology and industrial processes, and identifying them correctly provides important information for their use in engineering. Biochemical identification of thermophilic proteins is laborious, time-consuming, and costly, so a fast and accurate computational method is urgently needed. To address this need, we constructed a reliable benchmark dataset containing 1,368 thermophilic and 1,443 non-thermophilic proteins. We then proposed a multi-layer perceptron (MLP) model based on a multi-feature fusion strategy to discriminate thermophilic from non-thermophilic proteins. On an independent dataset, the proposed model achieved an accuracy of 96.26%, which demonstrates its good application prospects. To make the model convenient to use, we established a user-friendly software package called iThermo, which can be freely accessed at http://lin-group.cn/server/iThermo/index.html. The high accuracy of the model and the practicality of the software package indicate that this study can accelerate the discovery and engineering application of thermally stable proteins.

Highlights

  • In industrial and biotechnology development, researchers usually raise the temperature to shorten enzymatic reaction times (Tang et al., 2017)

  • The results showed that amino acid composition (AAC), traditional pseudo amino acid composition (tPseAAC), amphiphilic pseudo amino acid composition (aPseAAC), dipeptide composition (DC), dipeptide deviation from the expected mean (DDE), composition of k-spaced amino acid pairs (CKSAAP), and composition, transition, and distribution (CTD) produced best AUCs of 0.9735, 0.9580, 0.9610, 0.9143, 0.9165, 0.8349, and 0.9644, respectively

  • The performance of every descriptor except CTD improved after feature selection; we therefore retained all CTD features in our study
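
As an illustration of two of the descriptors named above, the following minimal sketch computes AAC and DC for a single sequence and fuses them by simple concatenation. It is not the authors' implementation: the function names (`aac`, `dc`, `fuse`) are hypothetical, the treatment of non-standard residues is an assumption, and concatenation is only one plausible reading of "multi-feature fusion".

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def aac(seq):
    """Amino acid composition: fraction of each standard residue (20 values)."""
    # Assumption: non-standard symbols (e.g. X, B) are ignored.
    residues = [r for r in seq.upper() if r in AMINO_ACIDS]
    n = len(residues)
    return [residues.count(a) / n if n else 0.0 for a in AMINO_ACIDS]


def dc(seq):
    """Dipeptide composition: fraction of each adjacent residue pair (400 values)."""
    seq = seq.upper()
    counts = {}
    total = 0
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        # Assumption: a pair is counted only if both residues are standard.
        if pair[0] in AMINO_ACIDS and pair[1] in AMINO_ACIDS:
            counts[pair] = counts.get(pair, 0) + 1
            total += 1
    return [
        counts.get(a + b, 0) / total if total else 0.0
        for a, b in product(AMINO_ACIDS, repeat=2)
    ]


def fuse(seq):
    """Concatenate AAC and DC into one 420-dimensional feature vector."""
    return aac(seq) + dc(seq)


if __name__ == "__main__":
    features = fuse("MKVLAAGIVGLCA")  # made-up example sequence
    print(len(features))
```

A vector built this way could be passed to any standard classifier, including an MLP; the remaining descriptors (tPseAAC, aPseAAC, DDE, CKSAAP, CTD) need additional physicochemical tables and parameters that are not given in this section.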


Summary

Introduction

In industrial and biotechnology development, researchers usually raise the temperature to shorten enzymatic reaction times (Tang et al., 2017). However, higher temperatures denature proteins and cause them to lose activity, so maintaining protein activity at elevated temperatures is an active topic in engineering. Temperature is crucial to cellular life, yet some organisms can live in high-temperature environments. Organisms with an optimal growth temperature (OGT) below 50 °C are regarded as mesophilic, and those with an OGT of 50 °C or above are called thermophilic (Gromiha and Suresh, 2008).
