Abstract

Over the last several years, many sequence alignment tools have appeared and become popular, driven by the fast evolution of next-generation sequencing technologies. Naturally, researchers who use such tools want maximum performance when executing them on modern infrastructures. Today’s NUMA (non-uniform memory access) architectures present major challenges in getting such applications to scale well as more processors/cores are used. The memory system in NUMA machines is highly complex and can be the main cause of an application’s performance loss. The existence of several memory banks in NUMA systems implies higher latency when a given processor accesses a remote bank. This phenomenon is usually attenuated by strategies that increase the locality of memory accesses. However, NUMA systems can also suffer from contention problems when concurrent accesses are concentrated on a small number of banks. Sequence alignment tools use large data structures to hold the reference genome to which all reads are aligned, so they are very sensitive to performance problems related to the memory system. The main goal of this study is to explore the trade-offs between data locality and data dispersion in NUMA systems. We performed experiments with several popular sequence alignment tools on two widely available NUMA systems to assess the performance of different memory allocation policies and data partitioning strategies. We find that no single method is best in all cases. However, we conclude that memory interleaving is the memory allocation strategy that provides the best performance when a large number of processors and memory banks are used. For data partitioning, the best results are usually obtained with a larger number of partitions, sometimes combined with an interleave policy.

Highlights

  • New genomic sequencing technologies have made a dramatic breakthrough in the development of genomic studies

  • We extend our previous results by expanding our comparison study to two different Non-uniform memory access (NUMA) systems, one based on Intel Xeon and the other one based on AMD Opteron, and by introducing a novel hybrid execution strategy that combines both data partitioning and memory allocation policies

  • On Linux systems, this will normally involve spreading the threads throughout the system and using the first-touch data allocation policy, under which data requested by a program running on a given CPU is stored in a memory bank local to that CPU [12]
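As a concrete illustration of the policies compared in the study, the sketch below shows how an aligner run could be launched under different NUMA placements with the standard `numactl` utility. The aligner binary (`bwa`) and the input file names are placeholders; the exact command line for any given tool will differ.

```shell
# First-touch (Linux default): each page is placed on the node of the
# CPU whose thread first writes it.
bwa mem ref.fa reads.fq > out.sam

# Interleave: pages (e.g. the reference index) are distributed
# round-robin across all memory banks, trading locality for lower
# bank contention.
numactl --interleave=all bwa mem ref.fa reads.fq > out.sam

# Full locality: pin both threads and memory to a single node.
numactl --cpunodebind=0 --membind=0 bwa mem ref.fa reads.fq > out.sam
```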



Introduction

New genomic sequencing technologies have made a dramatic breakthrough in the development of genomic studies. The steady trend of decreasing sequencing cost and increasing read length forces developers to create and maintain faster, up-to-date and more accurate software. Sequence alignment tools have become essential for genomic variant calling studies, and numerous such tools have been developed in recent years. They exhibit differences in sensitivity or accuracy [22], and most of them can execute in parallel on modern multicore systems. Writing parallel programs that exhibit good scalability on Non-uniform memory access (NUMA) architectures is far from easy. Achieving good system performance requires computations to be carefully designed in order to harmonize the execution of multiple threads and data accesses over multiple memory banks.

