Abstract

Long-read sequencing enables variant detection in genomic regions that are considered difficult-to-map by short-read sequencing. To fully exploit the benefits of longer reads, here we present a deep learning method NanoCaller, which detects SNPs using long-range haplotype information, then phases long reads with called SNPs and calls indels with local realignment. Evaluation on 8 human genomes demonstrates that NanoCaller generally achieves better performance than competing approaches. We experimentally validate 41 novel variants in a widely used benchmarking genome, which could not be reliably detected previously. In summary, NanoCaller facilitates the discovery of novel variants in complex genomic regions from long-read sequencing.

Highlights

  • Single-nucleotide polymorphisms (SNPs) and small insertions/deletions are two common types of genetic variants in human genomes

  • For SNP calling in NanoCaller, candidate SNP sites are selected according to the specified thresholds for minimum coverage and minimum frequency of alternative alleles

  • In the “Results” section, we present the performance of five NanoCaller models: ONT-HG001, ONT-HG002, CCS-HG001, CCS-HG002, and CLR-HG002 (ONT: Oxford Nanopore Technology; CCS: circular consensus sequencing; CLR: continuous long read sequencing); the first four have both SNP and indel deep learning models, whereas CLR-HG002 has only a SNP model
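
The candidate-site selection described in the highlights can be illustrated with a minimal sketch. This is not NanoCaller's actual code: the function name `select_candidate_sites`, the dictionary-based pileup, and the default thresholds are assumptions made for illustration, and "frequency of alternative alleles" is interpreted here as the fraction of reads supporting the most common non-reference base.

```python
from collections import Counter


def select_candidate_sites(pileup, ref_seq, min_coverage=8, min_alt_freq=0.15):
    """Return positions that pass coverage and alternative-allele frequency thresholds.

    pileup: dict mapping a 0-based position to the list of bases observed there.
    ref_seq: reference sequence as a string.
    The threshold defaults are hypothetical, not values taken from the paper.
    """
    candidates = []
    for pos in sorted(pileup):
        bases = pileup[pos]
        depth = len(bases)
        # Skip positions with too few reads to support a call.
        if depth < min_coverage:
            continue
        ref_base = ref_seq[pos]
        # Count only the bases that differ from the reference.
        alt_counts = Counter(b for b in bases if b != ref_base)
        if not alt_counts:
            continue
        alt_base, alt_count = alt_counts.most_common(1)[0]
        alt_freq = alt_count / depth
        # Keep the site only if the top alternative allele is frequent enough.
        if alt_freq >= min_alt_freq:
            candidates.append((pos, ref_base, alt_base, alt_freq))
    return candidates


ref_seq = "ACGT"
pileup = {
    0: ["A"] * 10,              # no alternative allele
    1: ["C"] * 6 + ["T"] * 4,   # depth 10, alternative frequency 0.4
    2: ["G"] * 2 + ["A"] * 2,   # depth 4, below the coverage threshold
    3: ["T"] * 9 + ["C"] * 1,   # alternative frequency 0.1, below threshold
}
result = select_candidate_sites(pileup, ref_seq)
print(result)
```

In this toy input only position 1 passes both filters. Raising `min_alt_freq` trades sensitivity for fewer candidates passed on to the downstream model.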

Introduction

Single-nucleotide polymorphisms (SNPs) and small insertions/deletions (indels) are two common types of genetic variants in human genomes. Variant calling methods for short reads, such as GATK [1] and FreeBayes [2], achieve excellent performance in detecting SNPs and small indels in genomic regions traditionally marked as “high-confidence regions” in various benchmarking tests [3,4,5]. Since these methods were developed for short-read sequencing data with low per-base error
