Abstract

Background

Predictive genomics is the translation of raw genome sequence data into a phenotypic assessment of the organism. For bacterial pathogens, these phenotypes can range from environmental survivability to the severity of human disease. Significant progress has been made in the development of generic tools for genomic analyses that are broadly applicable to all microorganisms; however, a fundamental missing component is the ability to analyze genomic data in the context of organism-specific phenotypic knowledge, which has been accumulated from decades of research and can provide a meaningful interpretation of genome sequence data.

Results

In this study, we present SuperPhy, an online predictive genomics platform (http://lfz.corefacility.ca/superphy/) for Escherichia coli. The platform integrates the analytical tools and genome sequence data for all publicly available E. coli genomes and facilitates the upload of new genome sequences from users under public or private settings. SuperPhy provides real-time analyses of thousands of genome sequences with results that are understandable and useful to a wide community, including those in the fields of clinical medicine, epidemiology, ecology, and evolution. SuperPhy includes: 1) identification of virulence and antimicrobial resistance determinants; 2) statistical associations between genotypes, biomarkers, geospatial distribution, host, source, and phylogenetic clade; 3) identification of biomarkers for groups of genomes based on the presence/absence of specific genomic regions and single-nucleotide polymorphisms; and 4) in silico determination of Shiga-toxin subtype.

Conclusions

SuperPhy is a predictive genomics platform that attempts to provide an essential link between the vast amounts of genome information currently being generated and phenotypic knowledge in an organism-specific context.

Highlights

  • Predictive genomics is the translation of raw genome sequence data into a phenotypic assessment of the organism

  • Pan-genome: at the time of writing, 2324 publicly available E. coli genomes from GenBank had been analyzed for incorporation into the SuperPhy platform [37]

  • The conserved core is used within SuperPhy to identify single-nucleotide polymorphisms (SNPs) that are used in phylogenetic tree building, as well as in the quality filtering of uploaded genomes
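
The abstract describes identifying biomarkers for groups of genomes from the presence/absence of genomic regions, but it does not name the statistical test used. As a minimal sketch, assuming a two-sided Fisher's exact test on a 2x2 presence/absence table (a common choice for this kind of comparison, not necessarily the one SuperPhy applies), candidate markers could be ranked as below. The function names, genome identifiers, and region names are all hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    a, b: group A genomes with / without the marker
    c, d: group B genomes with / without the marker
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of seeing x carriers in group A
        # when the row and column totals are held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    probs = [prob(x) for x in range(lowest, highest + 1)]
    # Sum every table that is as likely as, or less likely than, the observed one.
    return sum(p for p in probs if p <= observed * (1 + 1e-9))


def rank_markers(presence, group_a, group_b):
    """Rank markers by p-value (smallest first).

    presence: dict mapping marker name -> set of genome ids carrying it
    group_a, group_b: disjoint sets of genome ids to compare
    """
    results = []
    for marker, carriers in presence.items():
        a = len(carriers & group_a)
        b = len(group_a) - a
        c = len(carriers & group_b)
        d = len(group_b) - c
        results.append((marker, fisher_exact_two_sided(a, b, c, d)))
    return sorted(results, key=lambda item: item[1])


# Hypothetical toy data: six genomes split into two groups.
presence = {
    "region_1": {"g1", "g2", "g3"},  # present only in group A
    "region_2": {"g1", "g4"},        # spread evenly across both groups
}
group_a = {"g1", "g2", "g3"}
group_b = {"g4", "g5", "g6"}
ranking = rank_markers(presence, group_a, group_b)
```

With real data spanning thousands of genomes and many candidate regions or SNPs, the resulting p-values would also need a multiple-testing correction before any marker is treated as significant.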


