Abstract
Protein crystallization is crucial for biology, but the steps involved are complex and place heavy demands on both external conditions and the protein's internal structure. To save experimental cost and time, the tendency of a protein to crystallize can first be estimated and screened by modeling. This study therefore created a new pipeline that uses the protein sequence to predict crystallization propensity at three stages: protein material production, purification, and crystal production. The pipeline proposes a new feature selection method that combines Chi-square (${\chi }^{2}$) with recursive feature elimination to arrive at 12 selected features. These features are then reduced in dimensionality by linear discriminant analysis, and a support vector machine with hyperparameter tuning and 10-fold cross-validation is used to train the model and test the results. The pipeline has been tested on three different datasets, and its accuracy is higher than that of existing pipelines. In conclusion, our model provides a new solution for predicting multistage protein crystallization propensity, which is a major challenge in computational biology.
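The sequence of steps described in the abstract (Chi-square scoring, recursive feature elimination, linear discriminant analysis, then a tuned support vector machine under 10-fold cross-validation) can be sketched with scikit-learn. This is a minimal illustration under stated assumptions, not the authors' implementation: the synthetic data, the intermediate count of 30 Chi-square features, the linear SVM used as the elimination estimator, the min-max scaling (needed because Chi-square scoring requires non-negative inputs), and the hyperparameter grid are all hypothetical choices. Only the final count of 12 features and the 10 folds come from the abstract.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.feature_selection import RFE, SelectKBest, chi2
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC, LinearSVC

# Synthetic stand-in for sequence-derived features (assumption: the real
# features would be computed from protein sequences, which is not shown here).
X, y = make_classification(
    n_samples=300, n_features=40, n_informative=10, random_state=0
)

pipeline = Pipeline([
    # chi2 requires non-negative inputs, so scale each feature to [0, 1].
    ("scale", MinMaxScaler()),
    # Stage 1 of feature selection: keep the 30 best features by chi2 score
    # (30 is a hypothetical intermediate count).
    ("chi2", SelectKBest(chi2, k=30)),
    # Stage 2: recursive feature elimination down to 12 features,
    # using a linear SVM as the ranking estimator (assumption).
    ("rfe", RFE(LinearSVC(dual=False, max_iter=5000), n_features_to_select=12)),
    # Dimensionality reduction: with two classes LDA yields one component.
    ("lda", LinearDiscriminantAnalysis(n_components=1)),
    # Final classifier.
    ("svm", SVC()),
])

# Hypothetical hyperparameter grid for the SVM.
param_grid = {"svm__C": [0.1, 1, 10], "svm__gamma": ["scale", 0.1]}

# 10-fold stratified cross-validation drives the hyperparameter search.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
search = GridSearchCV(pipeline, param_grid, cv=cv, scoring="accuracy")
search.fit(X, y)

print("best parameters:", search.best_params_)
print("mean cross-validated accuracy:", round(search.best_score_, 3))
```

Placing the selection and reduction steps inside the `Pipeline` means they are refit on each training fold, so the held-out fold does not influence which features are chosen. The score reported by the search is still the one used to pick the hyperparameters, so an independent test set would be needed for an unbiased estimate.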