JASPER: A fast genome polishing tool that improves accuracy of genome assemblies.

Alina Guo, Aleksey V Zimin, Steven L Salzberg

doi:10.1371/journal.pcbi.1011032

Open Access

https://doi.org/10.1371/journal.pcbi.1011032

Journal: PLOS Computational Biology	Publication Date: Mar 31, 2023
Citations: 5	License type: CC BY 4.0

Affiliation: Johns Hopkins University

Abstract

Advances in long-read sequencing technologies have dramatically improved the contiguity and completeness of genome assemblies. Using the latest nanopore-based sequencers, we can generate enough data for the assembly of a human genome from a single flow cell. With the long-read data from these sequencers, we can now routinely produce de novo genome assemblies in which half or more of a genome is contained in megabase-scale contigs. Assemblies produced from nanopore data alone, though, have relatively high error rates and can benefit from a process called polishing, in which more-accurate reads are used to correct errors in the consensus sequence. In this manuscript, we present a novel tool for genome polishing called JASPER (Jellyfish-based Assembly Sequence Polisher for Error Reduction). In contrast to many other polishing methods, JASPER gains efficiency by avoiding the alignment of reads to the assembly. Instead, JASPER uses a database of k-mer counts that it creates from the reads to detect and correct errors in the consensus. Our experiments demonstrate that JASPER is faster than alignment-based polishers, and both faster and more accurate than other k-mer-based polishing methods. We also introduce the idea of using a polishing tool to create population-specific reference genomes, and illustrate this idea using sequence data from multiple individuals from Tokyo, Japan.
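To make the alignment-free idea concrete, the following is a minimal toy sketch of k-mer-count-based polishing. It is not JASPER's actual algorithm or code: the function names (`count_kmers`, `weak_kmers`, `polish`), the `min_count` threshold, and the greedy substitution-only strategy are all assumptions chosen for illustration, and an in-memory `Counter` stands in for a Jellyfish k-mer database.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every k-mer in the reads (toy stand-in for a k-mer count database)."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def weak_kmers(seq, counts, k, lo, hi, min_count):
    """Number of k-mers starting in [lo, hi] seen fewer than min_count times."""
    return sum(1 for i in range(lo, hi + 1) if counts[seq[i:i + k]] < min_count)


def polish(assembly, counts, k, min_count=2):
    """Greedy substitution-only correction (hypothetical, for illustration).

    For each position covered by at least one poorly supported k-mer, try
    the three alternative bases and keep the one that strictly reduces the
    number of poorly supported k-mers covering that position.
    """
    seq = assembly
    for pos in range(len(seq)):
        lo = max(0, pos - k + 1)        # first k-mer start covering pos
        hi = min(pos, len(seq) - k)     # last k-mer start covering pos
        best, best_weak = seq, weak_kmers(seq, counts, k, lo, hi, min_count)
        if best_weak == 0:
            continue                    # position is fully supported by the reads
        for base in "ACGT":
            if base == seq[pos]:
                continue
            cand = seq[:pos] + base + seq[pos + 1:]
            weak = weak_kmers(cand, counts, k, lo, hi, min_count)
            if weak < best_weak:
                best, best_weak = cand, weak
        seq = best
    return seq


# Toy data: accurate reads drawn from a short "true" sequence, and a draft
# assembly that carries a single substitution error (A -> T at index 8).
truth = "ACGTTGCAAGGCTTAC"
reads = [truth[0:11], truth[3:16], truth]
draft = "ACGTTGCATGGCTTAC"

kmer_counts = count_kmers(reads, k=5)
polished = polish(draft, kmer_counts, k=5)
print(polished)
```

No read is ever aligned to the draft here: the only lookups are k-mer counts, which is the property the abstract credits for JASPER's speed. A real polisher would also need to handle insertions, deletions, and heterozygous sites, and would pick the count threshold from the k-mer coverage distribution instead of a fixed value.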

Similar Papers

Long-read sequencing in ecology and evolution: Understanding how complex genetic and epigenetic variants shape biodiversity.
Dan G Bock ... Polina Novikova
Molecular Ecology | VOL. 32
01 Mar 2023

Highly accurate long reads are crucial for realizing the potential of biodiversity genomics
Scott Hotaling ... Paul B Frandsen
BMC Genomics | VOL. 24
16 Mar 2023

MetaCONNET: A metagenomic polishing tool for long-read assemblies.
Bingru Sun ... Hui Tian
PLOS ONE | VOL. 19
03 Dec 2024

SVcnn: an accurate deep learning-based method for detecting structural variation based on long-read data
Yan Zheng ... Xuequn Shang
BMC Bioinformatics | VOL. 24
23 May 2023
