Open-access bacterial population genomics: BIGSdb software, the PubMLST.org website and their applications.

Keith A Jolley, Martin C J Maiden, James E Bray

doi:10.12688/wellcomeopenres.14826.1

Abstract

The PubMLST.org website hosts a collection of open-access, curated databases that integrate population sequence data with provenance and phenotype information for over 100 different microbial species and genera. Although the PubMLST website was conceived as part of the development of the first multi-locus sequence typing (MLST) scheme in 1998, the software it uses, the Bacterial Isolate Genome Sequence Database (BIGSdb, published in 2010), enables PubMLST to include all levels of sequence data, from single gene sequences up to and including complete, finished genomes. Here we describe developments in the BIGSdb software made from publication to June 2018 and show how the platform realises microbial population genomics for a wide range of applications. The system is based on the gene-by-gene analysis of microbial genomes, with each deposited sequence annotated and curated to identify the genes present and systematically catalogue their variation. Originally intended as a means of characterising isolates with typing schemes, the synthesis of sequences and records of genetic variation with provenance and phenotype data permits highly scalable (whole genome sequence data for tens of thousands of isolates) means of addressing a wide range of functional questions, including: the prediction of antimicrobial resistance; likely cross-reactivity with vaccine antigens; and the functional activities of different variants that lead to key phenotypes. There are no limitations to the number of sequences, genetic loci, allelic variants or schemes (combinations of loci) that can be included, enabling each database to represent an expanding catalogue of the genetic variation of the population in question.
In addition to providing web-accessible analyses and links to third-party analysis and visualisation tools, the BIGSdb software includes a RESTful application programming interface (API) that enables access to all the underlying data for third-party applications and data analysis pipelines.
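As an illustration of how a third-party pipeline might read data through such a RESTful API, the following is a minimal Python sketch. The base URL, the route layout and the database name used below are assumptions made for illustration and are not taken from the text above; the BIGSdb API documentation should be consulted for the actual resources.

```python
import json
import urllib.request

# Assumption: illustrative base URL and route layout, not stated in the abstract.
BASE_URL = "https://rest.pubmlst.org"


def profile_url(database: str, scheme_id: int, profile_id: int) -> str:
    """Build the URL of one allelic profile within a typing scheme."""
    return f"{BASE_URL}/db/{database}/schemes/{scheme_id}/profiles/{profile_id}"


def fetch_json(url: str, timeout: float = 30.0) -> dict:
    """Send a GET request and decode the JSON body of the response."""
    request = urllib.request.Request(url, headers={"Accept": "application/json"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

A pipeline could then call fetch_json(profile_url("pubmlst_neisseria_seqdef", 1, 11)), where the database name and identifiers are again hypothetical. Keeping URL construction separate from the network call means the route logic can be checked without contacting the server.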

Highlights

Our ability to study complex phenotypes, i.e. those that depend on the interactions of multiple components of an organism and its environment, has been enhanced during the past 20 years by the very large increases in our capacity to collect and analyse biological information.
Although the PubMLST website was conceived as part of the development of the first multi-locus sequence typing (MLST) scheme in 1998, the software it uses, the Bacterial Isolate Genome Sequence Database (BIGSdb, published in 2010), enables PubMLST to include all levels of sequence data, from single gene sequences up to and including complete, finished genomes.
We describe developments in the BIGSdb software made from publication to June 2018 and show how the platform realises microbial population genomics for a wide range of applications.

Summary

Introduction

Our ability to study complex phenotypes, i.e. those that depend on the interactions of multiple components of an organism and its environment, has been enhanced during the past 20 years by the very large increases in our capacity to collect and analyse biological information. Amongst the most important of these developments have been very high-throughput sequencing methods and the informatics approaches required to interpret the large volumes of data that they generate; at the time of writing, there remain major challenges in realising the potential of the opportunities presented by such developments[1]. These data must be stored, organised, curated, interpreted, analysed, and disseminated in a usable way. The gene-by-gene approach exemplified by MLST is inherently scalable with respect to the number of loci and individual organisms included[16], and the BIGSdb platform has been continually developed and extended to provide additional functionality.
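The gene-by-gene idea can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the BIGSdb implementation: the allele sequences, allele numbers and sequence types below are invented, and real allele calling works on full-length gene sequences found in assembled genomes rather than on exact lookups of short strings.

```python
from typing import Dict, Optional, Tuple

# Hypothetical allele tables: for each locus, known sequence -> allele number.
ALLELES: Dict[str, Dict[str, int]] = {
    "abcZ": {"ATGACC": 1, "ATGACT": 2},
    "adk": {"GGCTTA": 1, "GGCTTG": 2},
}

# A scheme is an ordered combination of loci.
SCHEME_LOCI: Tuple[str, ...] = ("abcZ", "adk")

# Hypothetical profile table: tuple of allele numbers -> sequence type (ST).
PROFILES: Dict[Tuple[int, ...], int] = {(1, 1): 1, (2, 1): 2}


def call_alleles(isolate: Dict[str, str]) -> Dict[str, Optional[int]]:
    """Assign an allele number per locus; None if missing or not yet catalogued."""
    return {
        locus: ALLELES[locus].get(isolate.get(locus, ""))
        for locus in SCHEME_LOCI
    }


def sequence_type(isolate: Dict[str, str]) -> Optional[int]:
    """Map an isolate's allelic profile to an ST, or None if undefined."""
    calls = call_alleles(isolate)
    profile = tuple(calls[locus] for locus in SCHEME_LOCI)
    if None in profile:
        return None
    return PROFILES.get(profile)
```

In this sketch, scaling to more loci or more isolates only means extending the lookup tables and calling the same functions again, which is the sense in which the approach is scalable.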

Journal: Wellcome Open Research
Publication Date: Sep 24, 2018
Citations: 1756
License type: CC BY 4.0

Similar Papers

Open-access bacterial population genomics: BIGSdb software, the PubMLST.org website and their applications
James E Bray ... Martin C J Maiden
Wellcome open research | VOL. 3
08 Oct 2018

BIGSdb: Scalable analysis of bacterial genome variation at the population level
Keith A Jolley ... Martin C J Maiden
BMC bioinformatics | VOL. 11
01 Dec 2010

Clonal Diversity of Candida auris, Candida blankii, and Kodamaea ohmeri Isolated from Septicemia and Otomycosis in Bangladesh as Determined by Multilocus Sequence Typing.
Fardousi Akter Sathi ... Sangjukta Roy
Journal of fungi (Basel, Switzerland) | VOL. 9
12 Jun 2023

Molecular characterization of Treponema pallidum subsp. pallidum in Switzerland and France with a new multilocus sequence typing scheme.
Linda Grillová ... Homayoun C Bagheri
PloS one | VOL. 13
30 Jul 2018
