Abstract

High throughput sequencing is now fast and cheap enough to be considered part of the toolbox for investigating bacteria, and there are thousands of bacterial genome sequences available for comparison in the public domain. Bacterial genome analysis is increasingly being performed by diverse groups in research, clinical and public health labs alike, who are interested in a wide array of topics related to bacterial genetics and evolution. Examples include outbreak analysis and the study of pathogenicity and antimicrobial resistance. In this beginner’s guide, we aim to provide an entry point for individuals with a biology background who want to perform their own bioinformatics analysis of bacterial genome data, to enable them to answer their own research questions. We assume readers will be familiar with genetics and the basic nature of sequence data, but do not assume any computer programming skills. The main topics covered are assembly, ordering of contigs, annotation, genome comparison and extracting common typing information. Each section includes worked examples using publicly available E. coli data and free software tools, all of which can be performed on a desktop computer.

Highlights

  • High throughput sequencing is fast and cheap enough to be considered part of the toolbox for investigating bacteria [1,2]

  • This work is performed by diverse groups of individuals including researchers, public health practitioners and clinicians, interested in a wide array of topics related to bacterial genetics and evolution

  • Bacterial genome sequences can be generated in-house in many labs, in a matter of hours or days, using benchtop sequencers such as the Illumina MiSeq, Ion Torrent PGM or Roche 454 FLX Junior [1,2]. Much of this data is available in the public domain, allowing for extensive comparative analysis; e.g. in February 2013 the GenBank database included >6,500 bacterial genome sequences


Summary

Introduction

High throughput sequencing is fast and cheap enough to be considered part of the toolbox for investigating bacteria [1,2]. Bacterial genome sequences can be generated in-house in many labs, in a matter of hours or days, using benchtop sequencers such as the Illumina MiSeq, Ion Torrent PGM or Roche 454 FLX Junior [1,2]. Much of this data is available in the public domain, allowing for extensive comparative analysis; e.g. in February 2013 the GenBank database included >6,500 bacterial genome sequences. In this beginner’s guide, we aim to provide an entry point for individuals wanting to make use of whole-genome sequence data for the de novo assembly of genomes to answer questions in the context of their broader research goals. The guide is not intended to be exhaustive, but to introduce a set of simple, flexible and free tools that can be used to investigate a variety of common questions, including (i) how does this genome compare to that one? and (ii) does this genome have plasmids, phage or resistance genes? Each section includes guidance on where to find more detailed technical information and alternative software packages, and where to look for more sophisticated approaches.

