Abstract

Oxford Nanopore sequencing can be used to obtain complete bacterial genomes. However, the error rates of Oxford Nanopore long reads are higher than those of Illumina short reads. Long-read assemblers using a variety of assembly algorithms have been developed to overcome this deficiency, but they have not been benchmarked for genomic analyses of bacterial pathogens using Oxford Nanopore long reads. In this study, the long-read assemblers Canu, Flye, Miniasm/Racon, Raven, Redbean, and Shasta were therefore benchmarked using Oxford Nanopore long reads of bacterial pathogens. Ten species were tested with mediocre- and low-quality simulated reads, and 10 species were tested with real reads. Raven was the most robust assembler, obtaining complete and accurate genomes. All Miniasm/Racon and Raven assemblies of mediocre-quality reads provided accurate antimicrobial resistance (AMR) profiles, while the Raven assembly of Klebsiella variicola with low-quality reads was the only assembly of low-quality reads with an accurate AMR profile among all assemblers and species. All assemblers performed well in predicting virulence genes from mediocre-quality and real reads, whereas only the Raven assemblies of low-quality reads had accurate numbers of virulence genes. For multilocus sequence typing (MLST), Miniasm/Racon was the most effective assembler for mediocre-quality reads, while only the Raven assemblies of Escherichia coli O157:H7 and K. variicola with low-quality reads showed positive MLST results. Miniasm/Racon and Raven were the best performers for MLST using real reads. The Miniasm/Racon and Raven assemblies showed accurate phylogenetic inference. For the pan-genome analyses, Raven was the strongest assembler for simulated reads, whereas Miniasm/Racon and Raven performed the best for real reads. Overall, the most robust and accurate assembler was Raven, closely followed by Miniasm/Racon.

Highlights

  • This limitation of short-read sequencing arises mainly because short reads cannot span repetitive structures that extend beyond the maximum read length, which produces unresolvable loops during genome assembly and results in an assembly consisting of many unordered contigs

  • Significantly more (p < 0.05) complete benchmarking universal single-copy orthologs (BUSCOs) were observed in the Miniasm/Racon and Raven assemblies than in those of the other assemblers (Supplementary Table S1), and the Raven assemblies had significantly more (p < 0.05) complete BUSCOs than the Miniasm/Racon assemblies

Summary

Introduction

The rapid development of whole-genome sequencing (WGS) technologies over the last decade has revolutionized the ability to monitor and trace outbreaks of infectious diseases [1]. Illumina short-read sequencing has been widely used for sequencing bacterial pathogens, which can produce millions of short reads (

