Abstract

Whole genome sequence analysis of prokaryotes is fundamentally important in understanding human infections, developing diagnostics and vaccines, biodefense studies, antimicrobial target identification and drug design. Rapid advances in sequencing technology have made it possible to produce several hundred prokaryotic genomes each year quickly and cheaply. The next-generation sequencing platforms (454 from Roche, Solexa from Illumina, and SOLiD from ABI) hold promise to further reduce the time and cost of whole genome sequencing. Multiple species of bacteria, and hundreds of strains thereof, are being sequenced every year, thanks to cutting-edge approaches such as re-sequencing, wherein the genome sequence of a reference organism is used as a scaffold to direct the analysis of several different strains [1]. Using this method, multiple whole-genome bacterial sequencing projects can now be completed in less than two weeks instead of months. The total number of completed genomes (including reference and strain re-sequencing projects) is consistently doubling every 16 months [2], with about 20 new genomes added every month [3]. By the end of March 2009, a total of 1775 prokaryotic genome sequences and draft assemblies were available in the NCBI genome database. At this pace of sequencing output, the study of a single bacterial genome has become almost pedestrian, while the comparison of multiple strains of a single species is within relatively easy reach. Comparison of genomic sequences has revealed mechanisms of change in bacterial lifestyles. We have learned how species have evolved strategies to survive and compete as part of adaptation to their preferred hosts, habitats or niches. Genomic comparison of multiple species and strains has provided insights into adaptive mechanisms leading to host or tissue tropism.
Such inferences, however, need to be tested functionally; hence the need to integrate genome data with cues obtainable from downstream ‘omic’ experiments that have sampled a variety of conditions or treatments. New informatic approaches are emerging that can integrate genomic and functional datasets and also make use of data available through published resources. The emergence of e-Science, Semantic Web, and Science 2.0 approaches holds great promise for holistic data integration and meaningful interpretation of community genomics and microarray experiments in an interactive and collaborative fashion. The present overview discusses some of these issues and ideas in relation to the ‘PLoS ONE prokaryotic genomes collection’.

Highlights

  • While the three organisms share a large proportion of their genes, major differences exist in their flexible genome components, such as prophages and insertion sequences [4].

Summary

Niyaz Ahmed*

Complete genome sequences of important bacterial pathogens and industrial organisms have significant consequences and offer opportunities for human health, industry and the environment. Addressing biological and clinical problems through genome sequence-based approaches offers many commercial opportunities. Whole genome sequencing has revealed new insights into the evolution of bacterial lifestyles, including strategies for adapting to new niches and overcoming competitors. Whole genome sequences representing more than 1500 prokaryotic organisms, combined with the dozens (to hundreds) of strain re-sequencing projects, pose mind-boggling problems for the optimal utilization of the resultant ‘omic’ datasets. Microbiologists are confronted with the challenge of translating these data into better human and animal healthcare solutions and of pursuing basic research approaches to interpret the data from ecological and evolutionary perspectives. New informatic approaches towards optimal utilization, holistic integration and meaningful interpretation of genome sequence data are urgently needed.

Introduction
Why sequence multiple species and strains?
Findings
Making sense of the genome piles
