Abstract

Salmonella enterica and Escherichia coli are bacterial species that colonize different animal hosts with sub-types that can cause life-threatening infections in humans. Source attribution of zoonoses is an important goal for infection control as is identification of isolates in reservoir hosts that represent a threat to human health. In this study, host specificity and zoonotic potential were predicted using machine learning in which Support Vector Machine (SVM) classifiers were built based on predicted proteins from whole genome sequences. Analysis of over 1000 S. enterica genomes allowed the correct prediction (67–90 % accuracy) of the source host for S. Typhimurium isolates and the same classifier could then differentiate the source host for alternative serovars such as S. Dublin. A key finding from both phylogeny and SVM methods was that the majority of isolates were assigned to host-specific sub-clusters and had high host-specific SVM scores. Moreover, only a minor subset of isolates had high probability scores for multiple hosts, indicating generalists with genetic content that may facilitate transition between hosts. The same approach correctly identified human versus bovine E. coli isolates (83 % accuracy) and the potential of the classifier to predict a zoonotic threat was demonstrated using E. coli O157. This research indicates marked host restriction for both S. enterica and E. coli, with only limited isolate subsets exhibiting host promiscuity by gene content. Machine learning can be successfully applied to interrogate source attribution of bacterial isolates and has the capacity to predict zoonotic potential.

Highlights

  • Salmonella enterica and Escherichia coli can be isolated from a large number of animal hosts, in particular birds and mammals

  • The accessory genome indicates some clustering by host for S. Typhimurium (STm), especially for avian and human isolates, but many of the STm isolates from different hosts were interspersed within several branches containing isolates of mixed origin

  • In this study we wanted to determine if a machine learning approach, Support Vector Machine (SVM), could assign the isolation host/habitat for both S. enterica and E. coli isolates based on analysis of differential predicted protein variants (PVs)
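To make the idea of assigning an isolation host from protein variants concrete, the following is a minimal sketch of a linear SVM trained on a presence/absence matrix of PVs. It is not the authors' pipeline: the data, the two-host setup (bovine versus human), the function names and the training settings (`epochs`, `rate`, `lam`) are all assumptions made for illustration, and the text here does not specify the kernel or software actually used.

```python
# Hypothetical presence/absence data: each row is one isolate, each column
# one predicted protein variant (PV); 1 = present in the genome, 0 = absent.
# Values and host labels are invented for illustration only.
TRAINING = [
    ([1, 1, 0, 0], "bovine"),
    ([1, 0, 0, 0], "bovine"),
    ([1, 1, 0, 1], "bovine"),
    ([0, 0, 1, 1], "human"),
    ([0, 1, 1, 1], "human"),
    ([0, 0, 1, 0], "human"),
]


def score(weights, features):
    """Signed decision value: positive favours 'bovine', negative 'human'."""
    return sum(w * x for w, x in zip(weights, features))


def train_linear_svm(samples, positive="bovine", epochs=200, rate=0.1, lam=0.01):
    """Fit a linear SVM (no bias term) by sub-gradient descent on the
    L2-regularised hinge loss. All hyperparameters are assumed values."""
    weights = [0.0] * len(samples[0][0])
    for _ in range(epochs):
        for features, label in samples:
            y = 1.0 if label == positive else -1.0
            violated = y * score(weights, features) < 1.0
            # Regularisation step: shrink every weight slightly.
            weights = [w * (1.0 - rate * lam) for w in weights]
            # Hinge step: only samples inside the margin move the weights.
            if violated:
                weights = [w + rate * y * x for w, x in zip(weights, features)]
    return weights


def predict(weights, features, positive="bovine", negative="human"):
    """Assign the host whose side of the decision boundary the isolate falls on."""
    return positive if score(weights, features) >= 0 else negative


weights = train_linear_svm(TRAINING)
predictions = [predict(weights, features) for features, _ in TRAINING]
accuracy = sum(p == label for p, (_, label) in zip(predictions, TRAINING)) / len(TRAINING)
print(predictions, accuracy)
```

In this toy setting, the magnitude of `score` plays the role of a host-specific SVM score: an isolate with a value near zero would sit between the two hosts, which loosely mirrors the idea of a generalist. A real analysis would use held-out isolates to estimate accuracy rather than the training set.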

Summary

Introduction

Salmonella enterica and Escherichia coli can be isolated from a large number of animal hosts, in particular birds and mammals. S. enterica serovars are usually associated with disease whereas the majority of E. coli are commensals with only a subset considered overt pathogens [1, 2]. Infections caused by these two genera impose a major burden of human morbidity and mortality, and many of these infections are zoonotic, i.e. are transmitted from animals to humans. [...] Enteritidis are often restricted to gastrointestinal disease in their different hosts. This differentiation is increasingly appearing simplistic with identification of invasive strains of S. Typhimurium (STm), such as ST313, in humans [3, 4, 5]. From a public health perspective, the capacity to ascribe correctly the source of an infection is [...]

000135 © 2017 The Authors

