Abstract

Variation in the genome of Pseudomonas aeruginosa, an important pathogen, can have dramatic impacts on the bacterium's ability to cause disease. We therefore asked whether it was possible to predict the virulence of P. aeruginosa isolates based on their genomic content. We applied a machine learning approach to a genetically and phenotypically diverse collection of 115 clinical P. aeruginosa isolates using genomic information and corresponding virulence phenotypes in a mouse model of bacteremia. We defined the accessory genome of these isolates through the presence or absence of accessory genomic elements (AGEs), sequences present in some strains but not others. Machine learning models trained using AGEs were predictive of virulence, with a mean nested cross-validation accuracy of 75% using the random forest algorithm. However, individual AGEs did not have a large influence on the algorithm's performance, suggesting instead that virulence predictions are derived from a diffuse genomic signature. These results were validated with an independent test set of 25 P. aeruginosa isolates whose virulence was predicted with 72% accuracy. Machine learning models trained using core genome single-nucleotide variants and whole-genome k-mers also predicted virulence. Our findings are a proof of concept for the use of bacterial genomes to predict pathogenicity in P. aeruginosa and highlight the potential of this approach for predicting patient outcomes.IMPORTANCEPseudomonas aeruginosa is a clinically important Gram-negative opportunistic pathogen. P. aeruginosa shows a large degree of genomic heterogeneity both through variation in sequences found throughout the species (core genome) and through the presence or absence of sequences in different isolates (accessory genome). P. aeruginosa isolates also differ markedly in their ability to cause disease. In this study, we used machine learning to predict the virulence level of P. aeruginosa isolates in a mouse bacteremia model based on genomic content. We show that both the accessory and core genomes are predictive of virulence. This study provides a machine learning framework to investigate relationships between bacterial genomes and complex phenotypes such as virulence.

Highlights

  • Variation in the genome of Pseudomonas aeruginosa, an important pathogen, can have dramatic impacts on the bacterium’s ability to cause disease

  • We used two previously reported collections, 98 archived isolates from adults with bacteremia at Northwestern Memorial Hospital (NMH) in Chicago, IL, USA [32], and 17 isolates from children with Shanghai fever, a P. aeruginosa infection presenting with sepsis and gastrointestinal symptoms, at Chang Gung Children’s Hospital in Taiwan [33]

  • These results provided a large collection of P. aeruginosa isolates with known whole-genome sequences and virulence in a mouse bacteremia model

Read more

Summary

Introduction

Variation in the genome of Pseudomonas aeruginosa, an important pathogen, can have dramatic impacts on the bacterium’s ability to cause disease. We used machine learning to predict the virulence level of P. aeruginosa isolates in a mouse bacteremia model based on genomic content. We show that both the accessory and core genomes are predictive of virulence. Supervised machine learning is an increasingly important tool in bacterial genomics and has been extensively applied to the prediction of antimicrobial resistance and identification of potential resistance determinants This approach has proven successful in a variety of species and using a variety of genomic features [21,22,23,24,25,26,27]. To our knowledge there has been no study directly modeling P. aeruginosa pathogenicity from genomic content

Methods
Results
Conclusion
Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.