Benchmarking of long-read assemblers for prokaryote whole genome sequencing.

Ryan R Wick, Kathryn E Holt

doi:10.12688/f1000research.21782.1

Open Access

https://doi.org/10.12688/f1000research.21782.1

Abstract

Background: Data sets from long-read sequencing platforms (Oxford Nanopore Technologies and Pacific Biosciences) allow for most prokaryote genomes to be completely assembled – one contig per chromosome or plasmid. However, the high per-read error rate of long-read sequencing necessitates different approaches to assembly than those used for short-read sequencing. Multiple assembly tools (assemblers) exist, which use a variety of algorithms for long-read assembly.

Methods: We used 500 simulated read sets and 120 real read sets to assess the performance of six long-read assemblers (Canu, Flye, Miniasm/Minipolish, Raven, Redbean and Shasta) across a wide variety of genomes and read parameters. Assemblies were assessed on their structural accuracy/completeness, sequence identity, contig circularisation and computational resources used.

Results: Canu v1.9 produced moderately reliable assemblies but had the longest runtimes of all assemblers tested. Flye v2.6 was more reliable and did particularly well with plasmid assembly. Miniasm/Minipolish v0.3 was the only assembler which consistently produced clean contig circularisation. Raven v0.0.5 was the most reliable for chromosome assembly, though it did not perform well on small plasmids and had circularisation issues. Redbean v2.5 and Shasta v0.3.0 were computationally efficient but more likely to produce incomplete assemblies.

Conclusions: Of the assemblers tested, Flye, Miniasm/Minipolish and Raven performed best overall. However, no single tool performed well on all metrics, highlighting the need for continued development on long-read assembly algorithms.

Highlights

Data sets from long-read sequencing platforms (Oxford Nanopore Technologies and Pacific Biosciences) allow for most prokaryote genomes to be completely assembled – one contig per chromosome or plasmid
Figure 1A/Figure 2A shows the proportion of read sets with each assembly status
For the real read sets, a higher proportion of completed assemblies indicates a more reliable assembler – one which is likely to make a completed assembly given a typical set of input reads
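
As an illustration of the reliability measure described in the last highlight, the proportion of completed assemblies per assembler could be tallied roughly as follows. This is a minimal sketch, not the authors' pipeline: the function name `completion_rates`, the status labels "complete" and "incomplete", the assembler names and the example outcomes are all hypothetical and were made up for illustration.

```python
from collections import Counter, defaultdict


def completion_rates(results):
    """Return the fraction of read sets each assembler completed.

    `results` is an iterable of (assembler, status) pairs, one pair per
    assembled read set. The status labels are hypothetical placeholders.
    """
    counts = defaultdict(Counter)
    for assembler, status in results:
        counts[assembler][status] += 1
    return {
        name: tally["complete"] / sum(tally.values())
        for name, tally in counts.items()
    }


# Made-up outcomes for illustration only; these are not data from the paper.
example = [
    ("assembler_A", "complete"),
    ("assembler_A", "complete"),
    ("assembler_A", "incomplete"),
    ("assembler_A", "complete"),
    ("assembler_B", "complete"),
    ("assembler_B", "incomplete"),
]

rates = completion_rates(example)
for name, rate in sorted(rates.items()):
    print(f"{name}: {rate:.2f}")
```

Under this reading, an assembler with a higher rate on typical real read sets would be considered the more reliable one.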

Summary

Introduction

Data sets from long-read sequencing platforms (Oxford Nanopore Technologies and Pacific Biosciences) allow for most prokaryote genomes to be completely assembled – one contig per chromosome or plasmid.

Methods

We used 500 simulated read sets and 120 real read sets to assess the performance of eight long-read assemblers (Canu, Flye, Miniasm/Minipolish, NECAT, NextDenovo/NextPolish, Raven, Redbean and Shasta) across a wide variety of genomes and read parameters.

Results

Canu v2.0 produced reliable assemblies and was good with plasmids, but it performed poorly with circularisation and had the longest runtimes of all assemblers tested. Miniasm/Minipolish v0.3/v0.1.3 was the most likely to produce clean contig circularisation. NECAT v20200119 was reliable and good at circularisation but tended to make larger sequence errors. Raven v1.1.10 was the most reliable for chromosome assembly, though it did not perform well on small plasmids and had circularisation issues.

Journal: F1000Research
Publication Date: Dec 23, 2019
Citations: 155
License type: CC BY 4.0
