Classifying Residues in Mechanically Stable and Unstable Substructures Based on a Protein Sequence: The Case Study of the DnaK Hsp70 Chaperone.

Michal Gala, Gabriel Žoldák

doi:10.3390/nano11092198

Open Access

https://doi.org/10.3390/nano11092198

Abstract

Artificial proteins can be constructed from stable substructures, whose stability is encoded in their protein sequence. Identifying stable protein substructures experimentally is the only available option at the moment because no suitable method exists to extract this information from a protein sequence. In previous research, we examined the mechanics of E. coli Hsp70 and found four mechanically stable (S class) and three unstable substructures (U class). Of the total 603 residues in the folded domains of Hsp70, 234 residues belong to one of four mechanically stable substructures, and 369 residues belong to one of three unstable substructures. Here our goal is to develop a machine learning model to categorize Hsp70 residues using sequence information. We applied three supervised methods: logistic regression (LR), random forest, and support vector machine. The LR method showed the highest accuracy, 0.925, to predict the correct class of a particular residue only when context-dependent physico-chemical features were included. The cross-validation of the LR model yielded a prediction accuracy of 0.879 and revealed that most of the misclassified residues lie at the borders between substructures. We foresee machine learning models being used to identify stable substructures as candidates for building blocks to engineer new proteins.
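The contrast between a context-free encoding and a context-dependent physico-chemical feature can be illustrated with a minimal Python sketch. It is not the authors' feature set: the Kyte-Doolittle hydropathy scale, the window half-width, and the function names `one_hot` and `window_hydropathy` are all assumptions chosen for illustration.

```python
# Kyte-Doolittle hydropathy scale. Using this particular scale is an
# assumption for illustration; the paper's actual descriptors may differ.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def one_hot(residue):
    """Context-free encoding: a 20-element indicator vector for one residue."""
    return [1.0 if residue == aa else 0.0 for aa in AMINO_ACIDS]


def window_hydropathy(sequence, half_width=3):
    """Context-dependent feature: mean hydropathy over a window centred on
    each residue. The window is truncated at the sequence termini."""
    features = []
    for i in range(len(sequence)):
        lo = max(0, i - half_width)
        hi = min(len(sequence), i + half_width + 1)
        window = sequence[lo:hi]
        features.append(sum(KD[aa] for aa in window) / len(window))
    return features


if __name__ == "__main__":
    toy_sequence = "AILKGDE"  # arbitrary toy fragment, not taken from the paper
    for aa, value in zip(toy_sequence, window_hydropathy(toy_sequence, 1)):
        print(aa, round(value, 2))
```

With one-hot encoding, a residue's feature vector is the same wherever that residue occurs, whereas the windowed value changes with the neighbouring residues. A per-residue classifier would then be trained on such windowed features, with each residue labelled S or U.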

Highlights

We develop a machine learning model that uses protein sequence information to classify residues into mechanically stable and unstable substructures.
The best performance was achieved with logistic regression, which showed the highest accuracy, 0.922, and a high Cohen’s kappa, 0.85.
We were not able to develop an accurate machine learning model using one-hot encoding, which indicates that the physico-chemical information encoded in amino acids is crucial.
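The two metrics quoted in the highlights, accuracy and Cohen's kappa, can be computed from per-residue labels as in the following minimal Python sketch. The label lists are made-up toy data, not results from the paper, and the function names are hypothetical.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of residues whose predicted class matches the true class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: agreement corrected for chance, (p_o - p_e) / (1 - p_e).

    p_o is the observed agreement (the accuracy) and p_e is the agreement
    expected if the predictions were independent of the true labels.
    The value is undefined when p_e equals 1 (a single class everywhere).
    """
    n = len(y_true)
    p_o = accuracy(y_true, y_pred)
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    return (p_o - p_e) / (1 - p_e)


if __name__ == "__main__":
    # Toy labels: S = stable substructure, U = unstable substructure.
    y_true = ["S", "S", "U", "U"]
    y_pred = ["S", "U", "U", "U"]
    print(accuracy(y_true, y_pred))     # 0.75
    print(cohen_kappa(y_true, y_pred))  # 0.5
```

Kappa is worth reporting alongside accuracy here because the two classes are unbalanced (234 S residues versus 369 U residues), so a classifier that always predicted U would already reach an accuracy of about 0.61 while having a kappa of 0.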

Summary

Introduction

Stable protein super-assemblies have recently been designed and engineered to form functional nanodevices such as nano-cages for therapeutic applications [1,2,3,4]. To increase the number and the complexity of these super-assemblies, mechanically stable building blocks are prerequisites. The stability and structure of the building blocks are fully encoded in their protein sequence. Short sequences can form different structures of different stabilities that are impacted by the presence of other folded substructures, which suggests a long-range contextual dependence.

Journal: Nanomaterials	Publication Date: Aug 26, 2021
Citations: 3	License type: CC BY 4.0
