Abstract

Background: Sequencing technologies have advanced to the point where it is possible to generate high-accuracy, haplotype-resolved, chromosome-scale assemblies. Several long-read sequencing technologies are available, and a growing number of algorithms have been developed to assemble the reads generated by those technologies. When starting a new genome project, it is therefore challenging to select the most cost-effective sequencing technology, as well as the most appropriate software for assembly and polishing. It is thus important to benchmark different approaches applied to the same sample.

Results: Here, we report a comparison of 3 long-read sequencing technologies applied to the de novo assembly of a plant genome, Macadamia jansenii. We have generated sequencing data using Pacific Biosciences (Sequel I), Oxford Nanopore Technologies (PromethION), and BGI (single-tube Long Fragment Read) technologies for the same sample. Several assemblers were benchmarked in the assembly of Pacific Biosciences and Nanopore reads. Results obtained from combining long-read technologies or short-read and long-read technologies are also presented. The assemblies were compared for contiguity, base accuracy, and completeness, as well as sequencing costs and DNA material requirements.

Conclusions: The 3 long-read technologies produced highly contiguous and complete genome assemblies of M. jansenii. At the time of sequencing, the cost associated with each method was significantly different, but continuous improvements in technologies have resulted in greater accuracy, increased throughput, and reduced costs. We propose updating this comparison regularly with reports on significant iterations of the sequencing technologies.

Highlights

  • Advances in DNA sequencing enable the rapid analysis of genomes, driving biological discovery

  • At the time of sequencing, the cost associated with each method was significantly different, but continuous improvements in technologies have resulted in greater accuracy, increased throughput, and reduced costs

  • The resulting assembly consisted of 1,631,183 contigs totaling 864 megabase pairs (Mb) in length and contained 15,583 contigs >10 kb with a total length of 338 Mb (Supplementary Table S3)
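The contig figures quoted in the last highlight (contig count, total length, and the subset longer than 10 kb) are the kind of summary that can be computed directly from a list of contig lengths. The following is a minimal sketch, not the authors' pipeline: the function name `contig_summary`, the `min_len` parameter, and the toy lengths are hypothetical, and the N50 value is included only as one commonly used contiguity measure that the text above does not itself name.

```python
from typing import Dict, Iterable


def contig_summary(lengths: Iterable[int], min_len: int = 10_000) -> Dict[str, int]:
    """Summarise contig lengths (in bp) for one assembly.

    Hypothetical helper for illustration; min_len mirrors the ">10 kb"
    threshold quoted in the highlight above.
    """
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)

    # Contigs strictly longer than the threshold, as in "contigs >10 kb".
    long_contigs = [n for n in ordered if n > min_len]

    # N50 (an assumption, not named in the source): the length of the contig
    # at which the running total, taken from the longest contig downwards,
    # first reaches half of the total assembly length.
    running = 0
    n50 = 0
    for n in ordered:
        running += n
        if running * 2 >= total:
            n50 = n
            break

    return {
        "n_contigs": len(ordered),
        "total_bp": total,
        "n_long_contigs": len(long_contigs),
        "long_total_bp": sum(long_contigs),
        "n50_bp": n50,
    }


# Toy lengths, invented for this example (not data from the study).
toy_lengths = [50_000, 30_000, 12_000, 5_000, 3_000]
summary = contig_summary(toy_lengths)
print(summary)
```

In practice the lengths would be read from an assembly FASTA file; that step is left out here to keep the sketch free of assumptions about file layout or tooling.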


Summary

Introduction

Advances in DNA sequencing enable the rapid analysis of genomes, driving biological discovery. We report a comparison of 3 long-read sequencing methods applied to the de novo sequencing of a plant, Macadamia jansenii. This is a rare species that is a close relative of the macadamia nut recently domesticated in Hawaii and Australia. The species was discovered as a single population of ∼60 plants in the wild in eastern Australia [2]. It is a flowering plant (angiosperm) in the Proteaceae family that is basal to the large eudicot branch of the flowering plant phylogeny [3].

