Abstract

Halophilic proteins can tolerate high salt concentrations. Understanding the features that underlie halophilicity is the first step toward engineering halostable crops. To this end, we examined protein features that contribute to the halotolerance of halophilic organisms. We compared more than 850 features of halophilic and non-halophilic proteins using various screening, clustering, decision tree, and generalized rule induction models to search for patterns that code for halotolerance. Up to 251 protein attributes were selected by various attribute weighting algorithms as important contributors to halostability; of these, 14 attributes were selected by 90% of the models, and the count of hydrogen received the highest weight (1.0) in 70% of the attribute weighting models, underscoring the importance of this attribute in feature selection modeling. The other attributes were mostly di-peptide frequencies. The number of groups did not change when K-Means and TwoStep clustering were performed on datasets with or without feature selection filtering. Although the induced trees were shallow, their accuracies were higher than 94%, and the frequency of hydrophobic residues was identified as the most important feature for building the trees. The performance evaluation values of the decision tree models were the same, and the best percentage of correctness was recorded with the Exhaustive CHAID and CHAID models. We did not find any significant difference in the percentage of correctness, performance evaluation, or mean correctness of the various decision tree models with or without feature selection. For the first time, we analyzed the performance of different screening, clustering, and decision tree algorithms for discriminating halophilic from non-halophilic proteins, and the results showed that amino acid composition can be used to discriminate between halo-tolerant and halo-sensitive proteins.
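To make the kind of sequence-derived attributes mentioned above more concrete, the following is a minimal sketch of how di-peptide frequencies and the fraction of hydrophobic residues could be computed from a protein sequence. It is an illustration only, not the authors' feature extraction pipeline: the function names, the example sequence, and the particular set of residues treated as hydrophobic are all assumptions.

```python
from collections import Counter
from itertools import product

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Assumption: one commonly used hydrophobic set; the paper's exact
# definition is not given in this section.
HYDROPHOBIC = set("AVILMFWC")


def dipeptide_frequencies(seq):
    """Return the relative frequency of each of the 400 possible di-peptides."""
    seq = seq.upper()
    # Overlapping windows of length 2 along the sequence.
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    counts = Counter(pairs)
    total = len(pairs)
    return {
        a + b: (counts[a + b] / total if total else 0.0)
        for a, b in product(AMINO_ACIDS, repeat=2)
    }


def hydrophobic_fraction(seq):
    """Return the fraction of residues that fall in the assumed hydrophobic set."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(residue in HYDROPHOBIC for residue in seq) / len(seq)


if __name__ == "__main__":
    example = "MDEAVDLE"  # hypothetical sequence, for illustration only
    freqs = dipeptide_frequencies(example)
    print(len(freqs), freqs["DE"], hydrophobic_fraction(example))
```

Each protein then becomes a fixed-length numeric vector (400 di-peptide frequencies plus any additional attributes), which is the form that screening, clustering, and decision tree models typically expect.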

Highlights

  • An extremophile is an organism that thrives in, and may even require, physically or geochemically extreme conditions that are detrimental to the majority of life on Earth

  • Feature selection: when the feature selection model was applied to the dataset of protein features to compare halo-tolerant with halo-sensitive proteins (T/S groups), 513 of 851 features were ranked as important (p > 0.95), implying that they contribute to stability under harsh conditions, and 51 features were found to be marginal (0.90 < p < 0.95)

  • A node containing only the important features was generated and used whenever the other models had to be run on the feature-selected dataset
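The ranking rule described in the highlights, with features scored as important (p > 0.95) or marginal (between 0.90 and 0.95) and only the important ones carried forward, can be sketched as follows. This is a hypothetical illustration: the function names, the "unimportant" label for the remaining features, the handling of values exactly at a threshold, and the example scores are assumptions rather than details taken from the paper.

```python
def rank_feature(p, important=0.95, marginal=0.90):
    """Label a feature by its importance score p, using the thresholds above."""
    if p > important:
        return "important"
    if p > marginal:
        return "marginal"
    # Assumption: everything at or below the marginal threshold is discarded.
    return "unimportant"


def keep_important(scores):
    """Return the names of features ranked as important, preserving input order."""
    return [name for name, p in scores.items() if rank_feature(p) == "important"]


if __name__ == "__main__":
    # Hypothetical scores, for illustration only.
    scores = {
        "count_of_hydrogen": 0.99,
        "freq_hydrophobic": 0.97,
        "freq_dipeptide_DE": 0.92,
        "sequence_length": 0.50,
    }
    print(keep_important(scores))
```

Downstream models would then be run twice, once on the full feature table and once on the columns returned by `keep_important`, to compare results with and without feature selection.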

Summary

Introduction

An extremophile is an organism that thrives in, and may even require, physically or geochemically extreme conditions that are detrimental to the majority of life on Earth. The enzymes from extremely halophilic organisms represent a fascinating example of adaptation because they can perform their functions in vivo and in vitro at 4-5 M NaCl, losing activity rapidly when exposed to low salt concentrations [2]. Genes for a number of halophilic enzymes have been cloned, including dihydrofolate reductase from Haloferax volcanii [3], glutamate dehydrogenase from Halobacterium salinarum [4], potential as biocatalysts in applications requiring low water activity such as reactions with high salt or organic solvent concentrations [12]. An important question in the field is to determine the protein features that are critical for halostability. To address this question, we can use an extremophile enzyme such as halolysin as a model.

