Abstract

BlastFrost is a highly efficient method for querying 100,000s of genome assemblies, building on Bifrost, a dynamic data structure for compacted and colored de Bruijn graphs. BlastFrost queries a Bifrost data structure for sequences of interest and extracts local subgraphs, enabling the identification of the presence or absence of individual genes or single nucleotide sequence variants. We show two examples using Salmonella genomes: finding within minutes the presence of genes in the SPI-2 pathogenicity island in a collection of 926 genomes and identifying single nucleotide polymorphisms associated with fluoroquinolone resistance in three genes among 190,209 genomes. BlastFrost is available at https://github.com/nluhmann/BlastFrost/tree/master/data.

Highlights

  • Recent advances in DNA sequencing technologies have reduced sequencing costs and hands-on time, and whole-genome sequencing of bacterial pathogens is being routinely performed by public health organizations

  • Compacted de Bruijn graphs of genomic sequences are a popular graph data structure consisting of nodes representing sequences of k-mers within the input genomes

  • EnteroBase includes genomic assemblies of 100,000s of bacterial strains together with genotypes based on legacy or core genome MLST, which facilitate the


Summary

Introduction

Recent advances in DNA sequencing technologies have reduced sequencing costs and hands-on time, and whole-genome sequencing of bacterial pathogens is being routinely performed by public health organizations. Bifrost [18] indexes bacterial genomes in a time- and memory-efficient implementation of a compacted and colored de Bruijn graph. BlastFrost uses this underlying Bifrost graph structure to extract subgraphs defined by a query, and can thereby efficiently extract sequence variants of the query from a database of 100,000s of bacterial genomes.
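
To make the idea of a colored k-mer index concrete, the following is a minimal Python sketch. It is not the Bifrost or BlastFrost implementation: the function names (`build_colored_index`, `query_presence`), the toy genomes, and the choice of k = 4 are all assumptions made for illustration. The sketch also skips compaction into unitigs, ignores reverse complements, and requires exact k-mer matches, so it only shows how "colors" (genome labels attached to k-mers) let a query report presence or absence per genome.

```python
from collections import defaultdict


def kmers(seq, k):
    """Yield every k-mer of seq, left to right."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def build_colored_index(genomes, k):
    """Map each k-mer to the set of genome names (colors) that contain it."""
    index = defaultdict(set)
    for name, seq in genomes.items():
        for kmer in kmers(seq, k):
            index[kmer].add(name)
    return index


def query_presence(index, query, k, genomes):
    """Return the genomes whose color appears on every k-mer of the query."""
    if len(query) < k:
        raise ValueError("query must be at least k characters long")
    hits = set(genomes)
    for kmer in kmers(query, k):
        # Intersect with the colors of this k-mer; unseen k-mers have none.
        hits &= index.get(kmer, set())
    return hits


# Hypothetical toy data, far smaller than any real genome collection.
genomes = {
    "g1": "ACGTACGTGA",
    "g2": "ACGTTCGTGA",
    "g3": "TTTTACGTAC",
}
index = build_colored_index(genomes, k=4)
print(sorted(query_presence(index, "CGTAC", 4, genomes)))  # ['g1', 'g3']
```

In this toy example the query is reported as present in g1 and g3 but not in g2, where a single base difference removes one of the query k-mers. Sharing all k-mers is a necessary but not sufficient condition for containing the full query sequence, which is one reason a graph-based tool works with paths and subgraphs rather than k-mer sets alone.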

