Highly accurate long-read HiFi sequencing data for five complex genomes

Ting Hon, Steven J Knapp, Kristin Mars, Joseph W Karalius, Michael A Hardigan, David Kudrna, Yu-Chih Tsai, Doreen Ware, Cynthia C Steiner, Greg Young, Jane M Landolin, Paul Peluso, Beth Shapiro, David R Rank, Nicholas Maurer

doi:10.1038/s41597-020-00743-4

Abstract

The PacBio® HiFi sequencing method yields highly accurate long-read sequencing datasets with read lengths averaging 10–25 kb and accuracies greater than 99.5%. These accurate long reads can improve results for complex applications such as single nucleotide and structural variant detection, genome assembly, assembly of difficult polyploid or highly repetitive genomes, and assembly of metagenomes. There is currently a need for sample datasets, both to evaluate the benefits of these long, accurate reads and to develop bioinformatic tools, including genome assemblers, variant callers, and haplotyping algorithms. We present deep-coverage HiFi datasets for five complex samples: the two inbred model genomes Mus musculus and Zea mays; two further complex genomes, octoploid Fragaria × ananassa and the diploid anuran Rana muscosa; and a mock metagenome community. The datasets reported here can be used without restriction to develop new algorithms and to explore complex genome structure and evolution. All data were generated on the PacBio Sequel II System.
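
To put the quoted accuracy figure in more familiar terms, the short Python sketch below converts an accuracy into a Phred-scaled quality value using the standard definition Q = -10 * log10(1 - accuracy), and estimates how many errors to expect in a read of a given length. The function names are illustrative and not part of any PacBio tool, and treating 99.5% as a uniform per-base accuracy is a simplifying assumption made only for this example.

```python
import math


def accuracy_to_phred(accuracy: float) -> float:
    """Convert an accuracy (fraction in [0, 1)) to a Phred-scaled quality value."""
    if not 0.0 <= accuracy < 1.0:
        raise ValueError("accuracy must be in the range [0, 1)")
    error_rate = 1.0 - accuracy
    return -10.0 * math.log10(error_rate)


def expected_errors(read_length_bp: int, accuracy: float) -> float:
    """Expected number of erroneous bases, assuming a uniform per-base accuracy."""
    return read_length_bp * (1.0 - accuracy)


if __name__ == "__main__":
    accuracy = 0.995  # lower bound quoted in the abstract
    print(f"Phred quality at {accuracy:.1%}: Q{accuracy_to_phred(accuracy):.1f}")
    for length_bp in (10_000, 25_000):  # read-length range quoted in the abstract
        errors = expected_errors(length_bp, accuracy)
        print(f"{length_bp} bp read: about {errors:.0f} expected errors")
```

Under these assumptions, 99.5% accuracy corresponds to roughly Q23, or about 50 errors in a 10 kb read. Because the abstract gives 99.5% as a lower bound, these error counts are upper limits rather than typical values.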

Journal: Scientific Data
Publication Date: Nov 17, 2020
Citations: 190
License type: open-access

Similar Papers

Highly accurate long reads are crucial for realizing the potential of biodiversity genomics
Scott Hotaling ... Paul B Frandsen
BMC Genomics | VOL. 24
16 Mar 2023

Benchmarking datasets for assembly-based variant calling using high-fidelity long reads
Hyunji Lee ... Junho Lee
BMC Genomics | VOL. 24
27 Mar 2023

Cancer genomics: new software tools making sequencing more accessible
En-Guo Chen ... Yan Lu
Personalized Medicine | VOL. 11
01 Mar 2014

Telomere-to-telomere assembly of diploid chromosomes with Verkko
Mikko Rautiainen ... Sergey Koren
Nature Biotechnology | VOL. 41
16 Feb 2023
