Abstract

Fungal infections have become a serious health concern worldwide. They usually occur when an invading fungus colonizes a particular part of the body and becomes too hard for the human immune system to resist. Existing antifungal treatments are often considered unsuitable because of their severe side effects. With the rapid growth of this chronic disease across the world, developing an accurate prediction model for fungal infections has become a challenging task for scientists. To address these issues, several prediction methods have been established for antifungal peptides. However, because the performance of these methods is limited and unsatisfactory, an effective and reliable model for antifungal peptides is still needed. In this study, we present an intelligent learning approach for the accurate prediction of antifungal peptides. The sequential and evolutionary features are explored by three descriptors, namely conjoint triad feature (CTF), pseudo-position-specific scoring matrix (PsePSSM), and position-specific scoring matrix-discrete wavelet transform (PSSM-DWT). The vectors extracted by these encoding methods are then fused to obtain multi-perspective descriptors representing both sequential and evolutionary features. To reduce the size of the multi-information vector and to remove noisy and irrelevant descriptors, we applied minimum redundancy and maximum relevance (mRMR) based feature selection to choose the optimal feature set. In the next step, the selected feature vector is evaluated with four machine learning models: fuzzy k-nearest neighbor (FKNN), random forest (RF), k-nearest neighbor (KNN), and support vector machine (SVM). The predicted labels of the individual learning algorithms are then provided to a genetic algorithm to form an ensemble classifier that further improves the prediction results.
Furthermore, the SHAP and LIME methods were used to interpret the contribution of features to model predictions. Our proposed iAFPs-EnC-GA model achieved prediction accuracies of 97.81% and 93.92% on the training and independent datasets, respectively, which is about 4% higher than existing models. We suggest that the iAFPs-EnC-GA model will be a valuable tool for scientists and might play a key role in drug development and academic research. The source code and all datasets are publicly available at https://github.com/farmanit335/iAFPs-EnC-GA.
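
To make the first of the three descriptors concrete, the following is a minimal sketch of a conjoint triad feature (CTF) encoder. It is not taken from the iAFPs-EnC-GA source code. The seven residue classes, the min-max normalization, and the names `GROUPS` and `ctf_vector` are assumptions for illustration, based on the commonly used form of the conjoint triad encoding; the abstract itself does not specify them.

```python
from itertools import product

# Seven residue classes often used for conjoint triad encoding.
# ASSUMPTION: the abstract does not list the grouping used by the authors.
GROUPS = ["AGV", "ILFP", "YMTS", "HNQW", "RK", "DE", "C"]
AA_TO_GROUP = {aa: i for i, group in enumerate(GROUPS) for aa in group}

# All 7 * 7 * 7 = 343 ordered triads of classes, each mapped to a vector index.
TRIADS = list(product(range(len(GROUPS)), repeat=3))
TRIAD_INDEX = {triad: i for i, triad in enumerate(TRIADS)}


def ctf_vector(sequence):
    """Return a 343-dimensional, min-max normalized triad count vector.

    Residues outside the 20 standard amino acids are skipped, so the
    residues on either side of a skipped one are treated as adjacent
    (a simplification chosen for this sketch).
    """
    classes = [AA_TO_GROUP[aa] for aa in sequence.upper() if aa in AA_TO_GROUP]
    counts = [0] * len(TRIADS)
    # Slide a window of three consecutive residues over the sequence.
    for i in range(len(classes) - 2):
        counts[TRIAD_INDEX[tuple(classes[i:i + 3])]] += 1
    lo, hi = min(counts), max(counts)
    if hi == lo:
        # Sequence too short to contain any triad: return an all-zero vector.
        return [0.0] * len(counts)
    return [(c - lo) / (hi - lo) for c in counts]
```

Calling `ctf_vector` on a peptide string yields a fixed-length vector regardless of peptide length, which is what allows such a descriptor to be concatenated with PSSM-based vectors before feature selection.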
