Accelerating Alignment for Short Reads Allowing Insertion of Gaps on Multi-Core Cluster

Yongjie Yang, Cheng Zhong, Danyang Chen

doi:10.1109/pdcat46702.2019.00019

Abstract

Sequence alignment is a fundamental task in the analysis of large-scale biological data. For the massive short-read alignment problem, a parallel short-read alignment method is developed on a multi-core cluster, based on dynamic programming, the divide-and-conquer principle, and the FUSE kernel module. The method allows gaps to be inserted, with the optimal number of gaps depending on the species and the sequence length. Experimental results on real and synthetic data show that the proposed parallel method achieves good speedup while retaining the same alignment accuracy as the sequential alignment method. Compared with the existing parallel alignment method, the proposed method markedly reduces the time needed to partition the reference genome and read files, and accelerates large-scale short-read alignment.
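
To make the combination of dynamic programming and divide and conquer concrete, the following is a minimal sketch, not the method proposed in the paper. It assumes unit costs for mismatches and gaps, uses a standard semi-global edit-distance recurrence, and replaces the multi-core cluster and FUSE-based file partitioning with a thread pool on a single machine. The names `align_read`, `align_chunk`, and `align_reads_parallel` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def align_read(reference, read):
    """Semi-global alignment of one read against the reference.

    Assumption: unit cost for each mismatch and each gap. The read may
    start and end anywhere in the reference at no cost.
    Returns (best_cost, end_position_in_reference).
    """
    n = len(reference)
    prev = [0] * (n + 1)  # row 0: free start anywhere in the reference
    for i in range(1, len(read) + 1):
        curr = [i] + [0] * n  # column 0: i read bases aligned to gaps
        for j in range(1, n + 1):
            mismatch = 0 if read[i - 1] == reference[j - 1] else 1
            curr[j] = min(
                prev[j - 1] + mismatch,  # match or substitution
                prev[j] + 1,             # gap in the reference
                curr[j - 1] + 1,         # gap in the read
            )
        prev = curr
    # First end position with minimal cost (free end in the reference).
    best_end = min(range(n + 1), key=prev.__getitem__)
    return prev[best_end], best_end


def align_chunk(reference, chunk):
    """Align every read in one partition sequentially."""
    return [align_read(reference, read) for read in chunk]


def align_reads_parallel(reference, reads, workers=2):
    """Divide the reads into contiguous chunks and align them concurrently.

    Assumption: a thread pool stands in for the cluster nodes.
    Results are returned in the original read order.
    """
    size = max(1, -(-len(reads) // workers))  # ceiling division
    chunks = [reads[k:k + size] for k in range(0, len(reads), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = pool.map(lambda c: align_chunk(reference, c), chunks)
    return [result for part in parts for result in part]


if __name__ == "__main__":
    reference = "ACGTACGTTAGC"
    reads = ["ACGTT", "ACGTAG", "TTAGC"]
    for read, (cost, end) in zip(reads, align_reads_parallel(reference, reads)):
        print(read, cost, end)
```

Because each read is aligned independently, the partitioned run returns the same costs as aligning the reads one by one, which mirrors the claim that parallelization does not change alignment accuracy.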

Full Text