Abstract

Pseudomonas aeruginosa is a Gram-negative bacillus included among the six “ESKAPE” microbial species, which have an outstanding ability to “escape” currently used antibiotics; developing new antibiotics against it is of the highest priority. Whereas minimum inhibitory concentration (MIC) values against Pseudomonas aeruginosa have previously been used for QSAR model development, disk diffusion results (inhibition zones) have apparently not been used for this purpose in the literature, and we decided to explore their use to this end. We developed multiple QSAR models using several machine learning algorithms (support vector classifier, k-nearest neighbors (KNN), random forest classifier, decision tree classifier, AdaBoost classifier, logistic regression and naïve Bayes classifier). We used four sets of molecular descriptors and fingerprints and three different methods of data balancing, together with the “native” data set. In total, 32 models were built for each set of descriptors or fingerprints and balancing method, of which 28 were selected and stacked to create meta-models. In terms of balanced accuracy, the best performance was provided by KNN, logistic regression and the decision tree classifier, but the ensemble method had slightly superior results in nested cross-validation.
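For readers less familiar with the metrics, balanced accuracy (the headline metric above) and positive predictive value (PPV, which the study also reports) can both be computed from the confusion counts of a binary classifier. The sketch below is a minimal plain-Python illustration, not the authors' code; the function names and the toy labels are hypothetical, and it assumes that both classes are present and that at least one compound is predicted active.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels (1 = active, 0 = inactive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity (assumes both classes occur)."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


def positive_predictive_value(y_true, y_pred):
    """Fraction of predicted actives that are truly active
    (assumes at least one positive prediction)."""
    tp, _, fp, _ = confusion_counts(y_true, y_pred)
    return tp / (tp + fp)


# Hypothetical toy data: 2 actives among 10 compounds (imbalanced).
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 1, 1, 0, 0, 0, 0, 0, 0]

ba = balanced_accuracy(y_true, y_pred)
ppv = positive_predictive_value(y_true, y_pred)
print(f"balanced accuracy = {ba:.3f}, PPV = {ppv:.3f}")
```

For fixed sensitivity and specificity, PPV falls as the proportion of active compounds falls, whereas balanced accuracy does not change; this is one reason the two metrics can diverge on imbalanced data.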

Highlights

  • Pseudomonas aeruginosa is a Gram-negative bacillus, widespread in various environments, from soil to water and from plants to animals [1].

  • 28 models were created with the molecular descriptors and stacked to create a meta-model. The latter was built by applying the logistic regression algorithm to the predicted probabilities of the individual models.

  • Positive predictive value (PPV) ranged between 11.72% and 76.31%; the stacked model had a mean PPV of 38.99% (s.d. 1.19%).
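The stacking step described in the second highlight (logistic regression applied to the predicted probabilities of the individual models) can be sketched as follows. This is a minimal plain-Python illustration under stated assumptions, not the authors' implementation: the probability table is hypothetical toy data with three base models rather than 28, the meta-learner is fitted by simple batch gradient descent, and treating the inputs as out-of-fold predictions is an assumption not confirmed by the text above.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_meta_logistic(P, y, lr=0.5, epochs=2000):
    """Fit a logistic-regression meta-model by batch gradient descent.

    P: list of rows, each holding the base models' predicted probabilities
       for one compound (one column per base model).
    y: list of 0/1 activity labels.
    Returns the weight vector (one weight per base model) and the bias.
    """
    n, k = len(P), len(P[0])
    w, b = [0.0] * k, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * k, 0.0
        for row, label in zip(P, y):
            # Prediction error of the current meta-model on this compound.
            err = sigmoid(b + sum(wj * pj for wj, pj in zip(w, row))) - label
            for j in range(k):
                grad_w[j] += err * row[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_meta(P, w, b):
    """Return the meta-model's probability of activity for each row of P."""
    return [sigmoid(b + sum(wj * pj for wj, pj in zip(w, row))) for row in P]


# Hypothetical predicted probabilities from three base models (columns),
# assumed here to be out-of-fold predictions for six compounds (rows).
P = [
    [0.9, 0.8, 0.7],
    [0.8, 0.6, 0.9],
    [0.7, 0.9, 0.6],
    [0.2, 0.3, 0.4],
    [0.1, 0.4, 0.2],
    [0.3, 0.1, 0.3],
]
y = [1, 1, 1, 0, 0, 0]

w, b = fit_meta_logistic(P, y)
meta_prob = predict_meta(P, w, b)
meta_label = [int(p >= 0.5) for p in meta_prob]
print(meta_label)
```

In practice a library implementation would normally replace the hand-written optimizer; the point of the sketch is only that the meta-model sees the base models' probabilities, not the original descriptors.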

Summary

Introduction

Pseudomonas aeruginosa is a Gram-negative bacillus, widespread in various environments, from soil to water and from plants to animals [1]. In healthy people it seldom causes disease, but as an opportunistic pathogen it may proliferate quickly in patients with a weakened immune system and trigger a range of serious acute and chronic infections [1,2]. It is the critical pathogen responsible for the morbidity and mortality associated with cystic fibrosis, as well as one of the major microbes causing nosocomial infections [3]. QSAR methods are very popular; it has been stated that “one would say that nowadays no drug is developed without previous QSAR analyses” [14]. They are computational methods that attempt to establish relationships between the chemical structure features of a set of compounds and one of their biological activities, expressed numerically [15]. We report on QSAR models developed for substances active against Pseudomonas aeruginosa, using inhibition zone (IZ) values from the ChEMBL database.

Results
Performance of Models Built with Molecular Descriptors
Performance of Models Built with Molecular Fingerprints
Y-Randomization
External Validation
Outliers and Applicability Domain
Descriptors
Discussion
The Dataset
Descriptors and Feature Selection
Classification Algorithms
Performance Evaluation
Outlier Detection and Applicability Domain
Conclusions