Abstract

Next generation sequencing (NGS) data analysis is highly compute intensive. In-memory computing, vectorization, bulk data transfer, and CPU frequency scaling are some of the hardware features of modern computing architectures. To obtain the best execution time and exploit these hardware features, the system-level parameters must be tuned before running the application. We studied GATK HaplotypeCaller, a component of common NGS workflows that consumes more than 43% of the total execution time. Multiple GATK 3.x versions were benchmarked, and the execution time of HaplotypeCaller was optimized through various system-level parameters, which included: (i) tuning parallel garbage collection and kernel shared memory to simulate in-memory computing, (ii) architecture-specific tuning of the PairHMM library for vectorization, (iii) enabling Java 1.8 features through GATK source-code compilation and building a runtime environment for parallel sorting and bulk data transfer, and (iv) switching the default 'on-demand' CPU frequency mode to 'performance' mode to accelerate the Java threads. As a result, the HaplotypeCaller execution time was reduced by 82.66% in GATK 3.3 and 42.61% in GATK 3.7. Overall, the execution time of the NGS pipeline was reduced by 70.60% and 34.14% for GATK 3.3 and GATK 3.7, respectively.
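To make these tunings concrete, the following is a minimal, hypothetical sketch (not the study's actual scripts) of how such a run might be launched: JVM flags enable parallel garbage collection, the temporary directory is placed on tmpfs (/dev/shm) to approximate in-memory I/O, the vectorized PairHMM implementation is requested on the GATK 3.x command line, and the CPU frequency governor is switched to 'performance'. The file names, heap size, and thread counts are illustrative assumptions.

```python
import glob
import os
import subprocess

# Hypothetical input/output names -- adjust for a real run.
REF = "reference.fasta"
BAM = "sample.recal.bam"
OUT = "sample.vcf"
TMPDIR = "/dev/shm/gatk"   # tmpfs-backed directory to approximate in-memory I/O


def set_performance_governor():
    """Switch every CPU core from the default 'ondemand' governor to
    'performance' (requires root privileges)."""
    for path in glob.glob("/sys/devices/system/cpu/cpu*/cpufreq/scaling_governor"):
        with open(path, "w") as fh:
            fh.write("performance\n")


def run_haplotypecaller():
    """Run GATK 3.x HaplotypeCaller with parallel GC, a shared-memory temp
    directory, multiple CPU threads, and the vectorized PairHMM."""
    os.makedirs(TMPDIR, exist_ok=True)
    cmd = [
        "java",
        "-Xmx32g",                         # large heap; illustrative value
        "-XX:+UseParallelGC",              # parallel garbage collection
        "-XX:ParallelGCThreads=8",
        f"-Djava.io.tmpdir={TMPDIR}",
        "-jar", "GenomeAnalysisTK.jar",
        "-T", "HaplotypeCaller",
        "-R", REF,
        "-I", BAM,
        "-o", OUT,
        "-nct", "8",                       # HaplotypeCaller CPU threads
        "--pair_hmm_implementation", "VECTOR_LOGLESS_CACHING",  # vectorized PairHMM
    ]
    subprocess.run(cmd, check=True)


if __name__ == "__main__":
    set_performance_governor()
    run_haplotypecaller()
```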

Highlights

  • The impact of next generation sequencing (NGS) technologies in revolutionizing the biological and clinical sciences has been unprecedented[1, 2]

  • The execution time of genome alignment using the Burrows-Wheeler Aligner (BWA) can be improved by parallelization that includes: (a) thread-parallelization by using multiple threads[12], (b) data-parallelization by splitting the input into distinct chunks or intermediate data and processing the chunks one by one within or across nodes[13], and (c) data-parallelization with concurrent execution by splitting the data into disjoint chunks and distributing them for concurrent processing within or across nodes (see the sketch after this list)

  • Non-Uniform Memory Access (NUMA) based multi-CPU design is a feature of modern High Performance Computing (HPC) architectures, and more than 2 terabytes of main memory can be available within a single node
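Below is a minimal, hypothetical sketch of the BWA parallelization strategies listed in the second highlight: each pre-split read chunk is aligned with multi-threaded `bwa mem -t` (thread-parallelization), and the disjoint chunks are processed concurrently within a node (data-parallelization with concurrent execution). The chunk names, chunk count, and thread counts are illustrative assumptions, not values from the paper.

```python
import subprocess
from concurrent.futures import ProcessPoolExecutor

REF = "reference.fasta"
# Pre-split, disjoint read chunks (names are illustrative).
CHUNKS = ["reads.chunk0.fastq", "reads.chunk1.fastq",
          "reads.chunk2.fastq", "reads.chunk3.fastq"]
THREADS_PER_CHUNK = 8   # thread-parallelization inside each bwa mem call


def align_chunk(chunk: str) -> str:
    """Align one read chunk with multi-threaded 'bwa mem'."""
    sam = chunk.replace(".fastq", ".sam")
    with open(sam, "w") as out:
        subprocess.run(["bwa", "mem", "-t", str(THREADS_PER_CHUNK), REF, chunk],
                       stdout=out, check=True)
    return sam


if __name__ == "__main__":
    # Data-parallel, concurrent execution of the disjoint chunks on one node.
    with ProcessPoolExecutor(max_workers=len(CHUNKS)) as pool:
        for sam in pool.map(align_chunk, CHUNKS):
            print("aligned:", sam)
```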


Introduction

The impact of next generation sequencing (NGS) technologies in revolutionizing the biological and clinical sciences has been unprecedented[1, 2]. In addition to data-parallelization (e.g. distribution of independent chunks of data across the CPUs), concurrent parallelization (e.g. multi-threading) is implemented on the multi-core CPUs of modern HPC systems[5, 14]. These types of BWA optimizations were carried out in our earlier work[3, 14] on a traditional HPC system. These implementations simulate the in-memory computing concept, which may bring a performance benefit but falls short in resource utilization[14]. To address this issue, we proposed optimizing the data-intensive computing model by using an optimal number of threads for each sample and processing multiple samples in parallel within a node[3]. Most variant discovery algorithms fail to scale up on multi-core HPC systems, which results in multi-threading overhead, poor scalability, and underutilization of HPC resources[16]. To address these challenges, data-parallelization and pipeline-parallel execution models have been implemented: cache fusion was used to improve the performance of genome alignment, choke elimination was used to remove waiting time in the workflow, and a merged-portion algorithm framework was invoked for better performance and optimal resource utilization in an optimized data-portion model[17]
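As an illustration of the sample-level parallel model described above (an optimal thread count per sample, with several samples processed concurrently within one node), the following is a minimal, hypothetical sketch; the alignment command, sample names, and the 4-way concurrency are assumptions rather than the paper's actual pipeline code.

```python
import os
import subprocess
from concurrent.futures import ProcessPoolExecutor

SAMPLES = ["sampleA", "sampleB", "sampleC", "sampleD"]    # illustrative names
CONCURRENT_SAMPLES = 4                                    # samples per node
CORES = os.cpu_count() or 8
THREADS_PER_SAMPLE = max(1, CORES // CONCURRENT_SAMPLES)  # per-sample thread budget


def process_sample(sample: str) -> str:
    """Run one per-sample pipeline step (alignment here) with the
    per-sample thread budget, so all cores of the node stay busy."""
    out = f"{sample}.sam"
    with open(out, "w") as fh:
        subprocess.run(
            ["bwa", "mem", "-t", str(THREADS_PER_SAMPLE), "reference.fasta",
             f"{sample}_R1.fastq", f"{sample}_R2.fastq"],
            stdout=fh, check=True)
    return out


if __name__ == "__main__":
    # Several samples run concurrently within the node.
    with ProcessPoolExecutor(max_workers=CONCURRENT_SAMPLES) as pool:
        for result in pool.map(process_sample, SAMPLES):
            print("finished:", result)
```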

