Abstract
Background: Next-generation sequencing technology has allowed efficient production of draft genomes for many organisms of interest. However, most draft genomes are just collections of independent contigs whose relative positions and orientations along the genome being sequenced are unknown. Although several tools have been developed to order and orient the contigs of draft genomes, more accurate tools are still needed.

Results: In this study, we present a novel reference-based contig assembly (or scaffolding) tool, named CAR, that can efficiently and more accurately order and orient the contigs of a prokaryotic draft genome based on a reference genome of a related organism. Given a set of contigs in multi-FASTA format and a reference genome in FASTA format, CAR outputs a list of scaffolds, each of which is a set of ordered and oriented contigs. For validation, we tested CAR on a real dataset composed of several prokaryotic genomes and compared its performance with several other reference-based contig assembly tools. Our experimental results show that CAR performs better than all of these other tools in terms of sensitivity, precision and genome coverage.

Conclusions: CAR serves as an efficient tool that can more accurately order and orient the contigs of a prokaryotic draft genome based on a reference genome. The web server of CAR is freely available at http://genome.cs.nthu.edu.tw/CAR/ and its stand-alone program can also be downloaded from the same website.

Electronic supplementary material: The online version of this article (doi:10.1186/s12859-014-0381-3) contains supplementary material, which is available to authorized users.
Highlights
Next-generation sequencing technology has allowed efficient production of draft genomes for many organisms of interest
We present a novel reference-based contig assembly tool, named CAR, that can efficiently and more accurately order and orient the contigs of a prokaryotic draft genome based on a reference genome of a related organism
Testing dataset: For validation, we used a real dataset composed of several prokaryotic genomes to test CAR and compared its performance with that of eight other reference-based contig assembly tools, namely Projector2 [6], OSLay [7], ABACAS [8], Mauve Aligner [9], fillScaffolds [10], r2cat [11], CONTIGuator [12] and SIS [13]
Summary
We present a novel reference-based contig assembly (or scaffolding) tool, named CAR, that can efficiently and more accurately order and orient the contigs of a prokaryotic draft genome based on a reference genome of a related organism. Given a set of contigs in multi-FASTA format and a reference genome in FASTA format, CAR outputs a list of scaffolds, each of which is a set of ordered and oriented contigs. We tested CAR on a real dataset composed of several prokaryotic genomes and compared its performance with several other reference-based contig assembly tools. Our experimental results show that CAR performs better than all of these other tools in terms of sensitivity, precision and genome coverage.
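To make the input and output described above concrete, the following is a minimal Python sketch of how contigs in multi-FASTA format and a scaffold of ordered, oriented contigs might be represented. It is not CAR's code and does not implement its ordering algorithm; the names `parse_multi_fasta`, `OrientedContig` and `scaffold_sequences` are hypothetical and chosen only for illustration, and the scaffold in the example is hand-written rather than computed.

```python
from dataclasses import dataclass
from typing import Dict, List


def parse_multi_fasta(text: str) -> Dict[str, str]:
    """Parse multi-FASTA text into {contig_id: sequence} (illustrative helper)."""
    records: Dict[str, List[str]] = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # The contig ID is the first whitespace-delimited token of the header.
            header = line[1:].split()
            current = header[0] if header else ""
            records[current] = []
        elif current is not None:
            records[current].append(line)
    return {name: "".join(parts) for name, parts in records.items()}


_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


@dataclass(frozen=True)
class OrientedContig:
    """A contig placed in a scaffold (hypothetical type, not from CAR)."""
    contig_id: str
    reverse: bool  # True if the contig lies on the reverse strand


def scaffold_sequences(scaffold: List[OrientedContig],
                       contigs: Dict[str, str]) -> List[str]:
    """Return the contig sequences in scaffold order, oriented as placed."""
    return [
        reverse_complement(contigs[c.contig_id]) if c.reverse
        else contigs[c.contig_id]
        for c in scaffold
    ]


# Toy input: two contigs in multi-FASTA format.
contigs = parse_multi_fasta(">c1\nACGT\nAA\n>c2 example\nGGCC\n")

# A hand-written scaffold: c2 forward, followed by c1 reversed.
scaffold = [OrientedContig("c2", reverse=False),
            OrientedContig("c1", reverse=True)]
oriented = scaffold_sequences(scaffold, contigs)
```

In this representation, ordering is the position of each contig in the list and orientation is the `reverse` flag; a tool's full output would then be a list of such scaffolds.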