Abstract
Background: Nanopore sequencing is a rapidly developing third-generation sequencing technology that generates long nucleotide reads in real time on a portable device. Genotypes are determined by detecting changes in the ionic current signal as a DNA/RNA fragment passes through a nanopore. Currently, nanopore base-calling has a higher error rate than short-read base-calling. Using deep neural networks, state-of-the-art nanopore base-callers achieve base-calling accuracy ranging from 85% to 95%.
Results: In this work, we propose a novel base-calling approach from the perspective of instance segmentation. In contrast to previous sequence labeling approaches, we formulate base-calling as a multi-label segmentation task. We also propose a refined U-net model, which we call UR-net, that can model sequential dependencies for a one-dimensional segmentation task. Experimental results show that the proposed base-caller, URnano, achieves competitive results compared with the recently proposed CTC-featured base-caller Chiron, using the same amount of training and test data for in-domain evaluation. Our results show that formulating base-calling as a one-dimensional segmentation task is a promising approach.
Availability: The source code and data are available at https://github.com/yaozhong/URnano
Contact: yaozhong@ims.u-tokyo.ac.jp
Supplementary information: Supplementary data are available online.