Abstract

Background: Pyrosequencing allows draft prokaryotic genome sequences to be obtained within a few days. However, shotgun sequencing assemblies usually consist of hundreds of contigs. A further multiplex PCR procedure is needed to fill the gaps and link the contigs into a complete chromosomal sequence, which is the basis for prokaryotic comparative genomic studies. In this article, we study various pyrosequencing strategies by simulating the assembly of 100 prokaryotic genomes.

Findings: The simulation study shows that a single-end 454 Jr. run combined with a paired-end 454 Jr. run (8 kb library) can produce: 1) ~90% of 100 assemblies with 99.99%; 4) an average false gene duplication rate of < 0.7%; 5) an average false gene loss rate of < 0.4%.

Conclusions: A single-end 454 Jr. run combined with a paired-end 454 Jr. run (8 kb library) is a cost-effective approach to prokaryotic whole-genome sequencing. This strategy can produce high-quality draft assemblies for most prokaryotic organisms within days. Because the assemblies contain only a small number of scaffolds, the subsequent multiplex PCR procedure for gap filling should be straightforward. As a result, large-scale prokaryotic whole-genome sequencing projects may be finished within weeks.
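To illustrate why shotgun assemblies tend to break into many contigs, the following is a minimal sketch that places single-end reads uniformly at random on a linear genome and counts the covered islands that result. It is not the simulation pipeline used in the study: the function names, the genome size, the read length, and the read count are all hypothetical values chosen for illustration, and reads that merely touch are treated as joined, which a real assembler would not do.

```python
import random


def simulate_read_starts(genome_length, read_length, n_reads, seed=0):
    """Sample sorted, uniformly random start positions for single-end reads.

    Assumption: a linear genome and reads of one fixed length.
    """
    rng = random.Random(seed)
    last_start = genome_length - read_length
    return sorted(rng.randint(0, last_start) for _ in range(n_reads))


def merge_reads(starts, read_length):
    """Merge overlapping or touching reads into covered islands.

    `starts` must be sorted. Returns (number_of_islands, covered_bases).
    """
    islands = 0
    covered = 0
    cur_start = cur_end = None
    for s in starts:
        e = s + read_length
        if cur_end is None or s > cur_end:
            # A gap precedes this read: close the previous island, open a new one.
            if cur_end is not None:
                covered += cur_end - cur_start
            islands += 1
            cur_start, cur_end = s, e
        else:
            # The read overlaps or touches the current island: extend it.
            cur_end = max(cur_end, e)
    if cur_end is not None:
        covered += cur_end - cur_start
    return islands, covered


# Hypothetical parameters (not taken from the study).
GENOME_LENGTH = 1_000_000
READ_LENGTH = 400
N_READS = 20_000

starts = simulate_read_starts(GENOME_LENGTH, READ_LENGTH, N_READS)
islands, covered = merge_reads(starts, READ_LENGTH)
print(f"mean coverage: {N_READS * READ_LENGTH / GENOME_LENGTH:.1f}x")
print(f"islands: {islands}, covered fraction: {covered / GENOME_LENGTH:.4f}")
```

Varying `N_READS` shows how the island count changes with sequencing depth under these simplified assumptions. Paired-end information, which the study uses to link contigs into scaffolds, is deliberately left out of this sketch.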

Highlights

  • Our simulation shows the following: first, a single-end 454 Jr Titanium run combined with a paired-end 454 Jr Titanium run may assemble about 90% of 100 genomes into

Summary

Introduction

Despite a decrease in the rate of mortality due to diarrhea in the past few decades, diarrhea remains one of the leading causes of childhood deaths worldwide, especially in developing countries.

Recent genome-wide association studies (GWAS) have identified allele T of a single nucleotide polymorphism (SNP), rs2294008, in the prostate stem cell antigen (PSCA) gene as a risk factor for bladder cancer [1,2]. A recent GWAS of bladder cancer identified a SNP, rs11892031, within the UGT1A gene cluster on chromosome 2q37.1 as a novel risk factor. GWAS of human complex disease have identified a large number of disease-associated genetic loci, which are distinguished by distinctive frequencies of specific SNPs in individuals with a particular disease. These data do not provide direct information on the biological basis of a disease or on the underlying mechanisms.

There may be multiple paths in the de Bruijn graph that yield sequences whose optical maps match the genome's optical map, but in most cases these paths all yield very similar sequences.
