Abstract

Long-read sequencing enables variant detection in genomic regions that are considered difficult to map with short-read sequencing. To fully exploit the benefits of longer reads, we present NanoCaller, a deep-learning method that detects SNPs using long-range haplotype information, then phases long reads with the called SNPs and calls indels with local realignment. Evaluation on 8 human genomes demonstrated that NanoCaller generally achieves better performance than competing approaches. We experimentally validated 41 novel variants in a widely used benchmarking genome that could not previously be detected reliably. In summary, NanoCaller facilitates the discovery of novel variants in complex genomic regions from long-read sequencing.

