Abstract

This paper proposes NPEL (Nine-Pipeline & Ensemble Learning), a machine-learning strategy that predicts protein-protein interaction hot spots by training on amino acid composition, surface area, amino acid chains, and other structural information related to the complex and its interface. Nine machine learning algorithms, including Random Forest, linear SVM, KNN, Gaussian Naive Bayes, a multi-layer perceptron neural network, AdaBoost, and XGBoost, are each built into an independent pipeline to predict hot spots, and the final results are optimized through voting and stacking schemes. Stacking XGBoost with Logistic Regression yields the highest accuracy, 0.8462, and greatly improves on the metrics of the individual pipelines.
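The voting step can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes plain majority (hard) voting over binary labels (1 = hot spot, 0 = non-hot spot), and the function names `hard_vote` and `accuracy`, along with the toy predictions, are hypothetical and chosen only for illustration.

```python
from collections import Counter


def hard_vote(predictions_per_model):
    """Majority vote across models.

    predictions_per_model: one list of labels per model, all the same length.
    Ties (possible only with an even number of models) go to the label
    that appears first among the tied votes for that sample.
    """
    if not predictions_per_model:
        raise ValueError("need predictions from at least one model")
    n_samples = len(predictions_per_model[0])
    if any(len(p) != n_samples for p in predictions_per_model):
        raise ValueError("all models must predict the same number of samples")

    voted = []
    # zip(*...) regroups the labels so each tuple holds one sample's votes.
    for sample_votes in zip(*predictions_per_model):
        label, _count = Counter(sample_votes).most_common(1)[0]
        voted.append(label)
    return voted


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy predictions from three hypothetical pipelines for four residues.
pipeline_a = [1, 0, 1, 0]
pipeline_b = [1, 1, 0, 0]
pipeline_c = [0, 0, 1, 0]
ensemble = hard_vote([pipeline_a, pipeline_b, pipeline_c])
```

With nine pipelines and binary labels a tie cannot occur, so the tie-breaking rule only matters for an even number of voters. Stacking differs from voting in that a second-level model is trained on the base pipelines' outputs instead of counting them.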
