Ultra-Fast Next Generation Human Genome Sequencing Data Processing Using DRAGEN™ Bio-IT Processor for Precision Medicine

Amit Goyal, Sunghoon Lee, Reena Garg, Hyuk Jung Kwon, Yoon Hee Kim, Kichan Lee, Min Seob Lee, Seon Young Yun

doi:10.4236/ojgen.2017.71002

Abstract

The slow speed of next-generation sequencing (NGS) data analysis with the current industry-standard genome analysis pipeline, compared with the output of the latest high-throughput sequencers such as the HiSeq X system, has been the major cause of a data backlog that limits the real-time use of genomic data for precision medicine. This study demonstrates the DRAGEN™ Bio-IT Processor as a potential candidate to remove this “Big Data Bottleneck”. DRAGEN™ completed variant calling for ~40× coverage whole genome sequencing (WGS) data in as little as ~30 minutes using a single command, achieving an over 50-fold increase in data analysis speed while maintaining similar or better variant calling accuracy than the standard GATK Best Practices workflow. This systematic comparison offers NGS-based healthcare industries and research institutes a faster and more efficient NGS data analysis alternative to meet the requirements of precision medicine based healthcare.

Highlights

With the emergence of second-generation, high-throughput Next-Generation Sequencing (NGS) platforms and the accurate, consistent identification of genomic variants, the use of personal genome sequencing information for diagnostic and prognostic purposes has become a reality [1] [2].
The slow speed of NGS data analysis with the current industry-standard genome analysis pipeline, compared with the output of the latest high-throughput sequencers such as the HiSeq X system, has been the major cause of a data backlog that limits the real-time use of genomic data for precision medicine.
The variant calling efficiencies of the two pipelines were evaluated by comparing their variants with the GIABv2.19 high-confidence call-set [12] [13]. These studies demonstrate that the DRAGEN Bio-IT Processor decreased the Whole Genome Sequencing (WGS) data analysis time to just ~40 minutes while achieving equivalent or better genotype variant calling accuracy than the standard Genome Analysis Toolkit (GATK) Best Practices workflow.

Summary

Introduction

With the emergence of second-generation, high-throughput Next-Generation Sequencing (NGS) platforms and the accurate, consistent identification of genomic variants, the use of personal genome sequencing information for diagnostic and prognostic purposes has become a reality [1] [2]. The most commonly used pipeline, the Genome Analysis Toolkit (GATK) Best Practices workflow, requires several hours to several days to analyze a single human whole genome sequencing data set, depending on the available processors. Several cloud-based solutions, such as GenomePilot by Appistry [9], have been introduced to speed up NGS data analysis. This conventional cluster approach requires an expensive computer system, maintenance and monitoring. The variant calling efficiencies of the two pipelines were evaluated by comparing their variants with the GIABv2.19 high-confidence (truth) call-set [12] [13]. These studies demonstrate that the DRAGEN Bio-IT Processor decreased the WGS data analysis time to just ~40 minutes while achieving equivalent or better genotype variant calling accuracy than the standard GATK Best Practices workflow.
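
The comparison against a truth call-set described above can be illustrated with a minimal sketch. It is not the authors' evaluation procedure: it assumes each variant is reduced to a hypothetical (chromosome, position, reference allele, alternate allele) key and that both call-sets are already normalized and restricted to the same high-confidence regions. The function name `compare_to_truth` and the toy variants are invented for illustration.

```python
from typing import NamedTuple, Set, Tuple

# Assumption: a variant is identified by (chromosome, position, ref, alt).
Variant = Tuple[str, int, str, str]


class Accuracy(NamedTuple):
    true_positives: int
    false_positives: int
    false_negatives: int
    sensitivity: float
    precision: float
    f1: float


def compare_to_truth(calls: Set[Variant], truth: Set[Variant]) -> Accuracy:
    """Score a pipeline's calls against a truth call-set by exact key match."""
    tp = len(calls & truth)   # called and present in the truth set
    fp = len(calls - truth)   # called but absent from the truth set
    fn = len(truth - calls)   # in the truth set but not called
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return Accuracy(tp, fp, fn, sensitivity, precision, f1)


# Toy data only; these are not variants from the study.
truth_set = {
    ("chr1", 1000, "A", "G"),
    ("chr1", 2000, "C", "T"),
    ("chr2", 500, "G", "A"),
    ("chr2", 900, "T", "C"),
}
pipeline_calls = {
    ("chr1", 1000, "A", "G"),
    ("chr1", 2000, "C", "T"),
    ("chr2", 500, "G", "A"),
    ("chr3", 42, "A", "T"),   # not in the truth set: a false positive
}

result = compare_to_truth(pipeline_calls, truth_set)
print(result)
```

Running the same function on the calls from each pipeline would give sensitivity and precision figures that can be compared side by side. Real benchmarking additionally has to reconcile different representations of the same variant and genotype mismatches, which this exact-match sketch ignores.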

Sequence Data-Set and GIAB Validation Call-Set

GATK Best Practices Workflow

DRAGEN Bio-IT Processor and DRAGEN Genome Pipelines

Performance Assessment of the Two Variant Calling Pipelines

Research Scheme

Runtime Performance of the Genome Analysis Pipelines

Variant Calling Accuracy of the WGS Variant Calling Pipelines

Discussion


Journal: Open Journal of Genetics	Publication Date: Jan 1, 2017
Citations: 29	License type: CC BY 4.0


