Abstract

Cholesterol-lowering peptides (CLPs) are bioactive biomolecules often derived from food proteins. These short peptides bind bile acids, which decreases intestinal absorption of cholesterol. CLPs are promising bioceuticals that could support interventions for the management of high cholesterol. Integrating machine learning (ML) into the screening and discovery workflow for CLPs can reduce trial and error, thereby accelerating the overall process and increasing its efficiency. In this study, a support vector machine model that can distinguish CLPs from non-CLPs is presented. The model was built on a diverse dataset of 1840 peptides with sequence lengths ranging from 4 to 7 amino acids. The ML model needs only 8 features (VHSE scores), and feature importance analysis using Shapley and permutation-based methods found that the most important features relate to peptide polarity and hydrophobicity. The formulated ML classifier is reliable, as demonstrated by AUC >0.7 for a diverse test dataset and AUC >0.9 for a conservative validation dataset composed mainly of the top and bottom CLPs. Overall, the presented ML model offers incremental yet meaningful advances in the application of ML to understanding the nature of CLPs and to their discovery and development.
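To make the reported AUC thresholds concrete, the following is a minimal sketch of how an AUC value can be computed from classifier scores. It is not the study's code. The function name `roc_auc`, the labels, and the scores are hypothetical and chosen only for illustration. AUC is taken here as the probability that a randomly chosen CLP receives a higher score than a randomly chosen non-CLP, with ties counted as one half.

```python
def roc_auc(labels, scores):
    """Pairwise (Mann-Whitney) estimate of the area under the ROC curve.

    labels: 1 for CLP, 0 for non-CLP (hypothetical encoding).
    scores: classifier outputs, where higher means "more likely a CLP".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive correctly ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a correct ranking
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, not taken from the study.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]

auc = roc_auc(labels, scores)
print(round(auc, 3))  # 0.889 (8 of 9 positive/negative pairs ranked correctly)
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, so the reported values of >0.7 and >0.9 indicate that the classifier ranks CLPs above non-CLPs in most pairwise comparisons.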
