SWMapper: Scalable Read Mapper on SunWay TaihuLight

Kai Xu, Xin Li, Bertil Schmidt, Xiaohui Duan, Weiguo Liu, Xiangxu Meng

doi:10.1145/3404397.3404445

Abstract

With the rapid development of next-generation sequencing (NGS) technologies, high-throughput sequencing platforms continuously produce large amounts of short-read DNA data at low cost. Read mapping is a performance-critical task, being one of the first stages required by many different types of NGS analysis pipelines. We present SWMapper, a scalable and efficient read mapper for the Sunway TaihuLight supercomputer. To achieve high performance on its heterogeneous architecture, we propose a number of optimization techniques centered around a memory-efficient succinct hash index data structure, including seed filtration, duplicate removal, dynamic scheduling, asynchronous data transfer, and overlapping of I/O and computation. Furthermore, a vectorized version of the banded Myers algorithm for pairwise alignment using 256-bit vector registers is presented to fully exploit the computational power of the SW26010 processor. Our performance evaluation shows that SWMapper, using all 4 compute groups of a single Sunway TaihuLight node, outperforms S-Aligner on the same hardware by a factor of 6.2. In addition, compared to the state-of-the-art CPU-based mappers RazerS3, BitMapper2, and Hobbes3 running on a 4-core Xeon W-2123v3 CPU, SWMapper achieves speedups of 26.5, 7.8, and 2.6, respectively. Our optimizations achieve an aggregated speedup of 11 compared to the naive implementation on one compute group of an SW26010 processor, as well as a strong scaling efficiency of 74% on 128 compute groups.

Full Text