Abstract

Adhesion is the first step in pathogenesis and biofilm formation and is facilitated by a special class of cell wall proteins known as adhesins. Biofilms that form on catheters and other medical devices can subsequently lead to infections. Compared with bacterial adhesins, relatively little work has been done on the identification and characterization of fungal adhesins. Characterizing the sequences of fungal adhesins may lead to a better understanding of their role in pathogenesis. Experimental methods for investigating and characterizing fungal adhesins are labor-intensive and expensive, so there is a need for fast and efficient computational methods for their identification and characterization. The aim of the current study is twofold: (i) to develop an accurate predictor for fungal adhesins, and (ii) to sieve out the prominent molecular signatures present in fungal adhesins. Of the many supervised learning algorithms implemented in the current study, voting ensembles gave the highest prediction accuracy. The best voting ensemble, consisting of three support vector machines with three different kernels (PolyK, RBF, PuK), achieved an accuracy of 94.9% on leave-one-out cross-validation and 98.0% on a blind test set. A preference/avoidance list of molecular features and a set of human-interpretable rules were also extracted, giving insight into the general sequence features of fungal adhesins. Fungal adhesins are characterized by a high content of threonine and cysteine and by avoidance of phenylalanine and methionine. They also show avoidance of average hydrophilicity. The current analysis may facilitate understanding of the mechanism of fungal adhesin function, which may in turn help in designing methods for restricting adhesin-mediated pathogenesis.
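As an illustration only, the two ideas named in the abstract (residue-composition features and a voting ensemble) can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the study's implementation: the function names, the toy sequence, and the class labels are hypothetical, the study's actual feature set is not reproduced, and the three votes stand in for the outputs of the three SVMs (PolyK, RBF, PuK) rather than training them.

```python
from collections import Counter

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return the percentage of each standard residue in a protein sequence.

    Hypothetical helper: the study's real feature extraction is not
    described in the abstract.
    """
    seq = sequence.upper()
    counts = Counter(ch for ch in seq if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return {aa: 100.0 * counts[aa] / total for aa in AMINO_ACIDS}


def majority_vote(predictions):
    """Hard voting: return the label predicted by the most classifiers."""
    label, _ = Counter(predictions).most_common(1)[0]
    return label


# Toy sequence, made up for illustration (not a real adhesin).
composition = amino_acid_composition("TTCCTSAF")
threonine_pct = composition["T"]   # residue reported as preferred
methionine_pct = composition["M"]  # residue reported as avoided

# Placeholder votes standing in for three separately trained classifiers.
ensemble_label = majority_vote(["adhesin", "non-adhesin", "adhesin"])
```

With three voters and two classes a tie cannot occur, which is one practical reason to use an odd number of ensemble members for hard voting.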
