Abstract

Background: Legionella, the causative agent of Legionnaires’ disease, is ubiquitous in both natural and man-made aquatic environments. The distribution of Legionella genotypes within clinical strains is significantly different from that found in environmental strains. Developing novel genotypic methods that can distinguish clinical from environmental strains could help to focus control efforts on the more relevant (virulent) Legionella species. Mixed-genome microarray data can be used to perform a comparative-genome analysis of strain collections, and advanced statistical approaches, such as the Random Forest algorithm, are available to process these data.

Methods: Microarray analysis was performed on a collection of 222 Legionella pneumophila strains, which included patient-derived strains from notified cases in the Netherlands in the period 2002–2006 and the environmental strains that were collected during the source investigation for those patients within the Dutch National Legionella Outbreak Detection Programme. The Random Forest algorithm combined with a logistic regression model was used to select predictive markers and to construct a predictive model that could discriminate between strains of different origin: clinical or environmental.

Results: Four genetic markers were selected that correctly predicted 96% of the clinical strains and 66% of the environmental strains collected within the Dutch National Legionella Outbreak Detection Programme.

Conclusions: The Random Forest algorithm is well suited for the development of prediction models that use mixed-genome microarray data to discriminate between Legionella strains of different origin. The identification of these predictive genetic markers could make it possible to identify virulence factors within the Legionella genome, which in the future may be implemented in the daily practice of controlling Legionella in the public health environment.

Highlights

  • The bacterium Legionella is the causative agent of Legionnaires’ disease, an acute pneumonia that accounts for a substantial proportion of community-acquired pneumonia cases [1,2,3] and proves fatal in about 6–8.5% of diagnosed cases [4,5].

  • We described the development of a mixed-strain microarray, based on comparative genome hybridization (CGH), that contained genetic data from both clinical and environmental strains [12].

  • To maximize the sensitivity of the model while minimizing the loss of specificity, we used an arbitrary cut-off value of 0.06 to translate the logistic regression predictions for the training dataset into a classification of strains as clinical or environmental.
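
The effect of such a cut-off can be illustrated with a minimal sketch. It assumes, since the text here does not say so, that the logistic regression output is the predicted probability of a strain being clinical and that values at or above the cut-off are labelled clinical. The function names and the probabilities are made up for illustration and are not the study's data.

```python
from typing import List, Tuple

def classify(probabilities: List[float], cutoff: float = 0.06) -> List[str]:
    """Label a strain 'clinical' when its predicted probability reaches the cut-off.

    Assumption: the probability is P(clinical) and the comparison is inclusive.
    """
    return ["clinical" if p >= cutoff else "environmental" for p in probabilities]

def sensitivity_specificity(predicted: List[str], actual: List[str]) -> Tuple[float, float]:
    """Return (sensitivity, specificity), treating 'clinical' as the positive class."""
    pairs = list(zip(predicted, actual))
    tp = sum(p == "clinical" and a == "clinical" for p, a in pairs)
    fn = sum(p == "environmental" and a == "clinical" for p, a in pairs)
    tn = sum(p == "environmental" and a == "environmental" for p, a in pairs)
    fp = sum(p == "clinical" and a == "environmental" for p, a in pairs)
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical predicted probabilities for four clinical and four environmental strains.
probs = [0.90, 0.40, 0.07, 0.03, 0.10, 0.02, 0.01, 0.05]
actual = ["clinical"] * 4 + ["environmental"] * 4

low = sensitivity_specificity(classify(probs, cutoff=0.06), actual)
high = sensitivity_specificity(classify(probs, cutoff=0.50), actual)
print("cut-off 0.06:", low)
print("cut-off 0.50:", high)
```

With these made-up numbers, lowering the cut-off raises sensitivity at the cost of specificity, which is the trade-off the highlight describes.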


Summary

Introduction

The bacterium Legionella is the causative agent of Legionnaires’ disease, an acute pneumonia that accounts for a substantial proportion of community-acquired pneumonia cases (ranging from 1.9–20%) [1,2,3] and proves fatal in about 6–8.5% of diagnosed cases [4,5]. Together with the implementation of new governmental laws and guidelines to prevent the growth of Legionella bacteria in potential sources, an attempt was made to diminish the overall impact of Legionnaires’ disease in the Netherlands. The development of novel genotypic methods that can distinguish clinical from environmental strains would be a welcome step towards focusing control efforts on the more relevant (virulent) Legionella species. Mixed-genome microarray data can be used to perform a comparative-genome analysis of strain collections, and advanced statistical approaches, such as the Random Forest algorithm, are available to process these data.
