Abstract

Genomic medicine aims to build individualized strategies for diagnostic and therapeutic decision-making from patients' genomic information. Big Data analytics uncovers hidden patterns, unknown correlations, and other insights by examining large and varied data sets. Although integrating and manipulating diverse genomic data and comprehensive electronic health records (EHRs) on a Big Data infrastructure poses challenges, it also offers a feasible opportunity to develop an efficient and effective approach for identifying clinically actionable genetic variants for individualized diagnosis and therapy. In this paper, we review the challenges of manipulating large-scale next-generation sequencing (NGS) data and diverse clinical data derived from EHRs for genomic medicine. We introduce possible solutions to the challenges of manipulating, managing, and analyzing genomic and clinical data in order to implement genomic medicine. We also present a practical Big Data toolset for identifying clinically actionable genetic variants using high-throughput NGS data and EHRs.

Highlights

  • Next-generation sequencing (NGS) technologies, such as whole-genome sequencing (WGS), whole-exome sequencing (WES), and targeted sequencing, are increasingly applied in biomedical research and medical practice to identify disease- and drug-associated genetic variants and thereby advance precision medicine [1,2]

  • We give an overview of the challenges in processing genomic data and electronic health records (EHRs), provide possible solutions to overcome these challenges using approaches that ensure the safety of genomic data, and present a Big Data solution for identifying clinically actionable variants in sequence data

  • Alignment enables several quality control (QC) measures, such as the proportion of all reads aligned to a reference sequence, the ratio of unique reads aligned to a reference sequence, and the number of reads aligned at a specific locus
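As a minimal sketch of the three alignment QC measures named in the last highlight, the following Python computes them over a small in-memory list of reads. The `Read` record, the `alignment_qc` function, and the toy reads are hypothetical illustrations, not part of the paper's toolset. The sketch also assumes that the "ratio of unique reads" is taken relative to all reads; a real pipeline would derive these values from BAM/SAM files and might use a different denominator.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Read:
    """Hypothetical, simplified alignment record (not a real SAM/BAM parser)."""
    name: str
    chrom: Optional[str] = None  # None means the read did not align
    pos: int = 0                 # 1-based leftmost aligned position
    length: int = 0              # number of reference bases covered
    unique: bool = False         # True if the read has a single best alignment


def alignment_qc(reads: List[Read], chrom: str, locus: int) -> dict:
    """Compute three simple QC measures from a list of reads."""
    total = len(reads)
    aligned = [r for r in reads if r.chrom is not None]
    unique = [r for r in aligned if r.unique]
    # A read covers the locus if the locus falls in [pos, pos + length).
    depth = sum(
        1 for r in aligned
        if r.chrom == chrom and r.pos <= locus < r.pos + r.length
    )
    return {
        # Assumption: both fractions use the total read count as denominator.
        "aligned_fraction": len(aligned) / total if total else 0.0,
        "unique_fraction": len(unique) / total if total else 0.0,
        "depth_at_locus": depth,
    }


# Toy data: four aligned reads (three uniquely) and one unaligned read.
reads = [
    Read("r1", "chr1", 100, 50, True),
    Read("r2", "chr1", 120, 50, True),
    Read("r3", "chr1", 140, 50, False),
    Read("r4", "chr2", 100, 50, True),
    Read("r5"),
]
qc = alignment_qc(reads, chrom="chr1", locus=145)
print(qc)
```

On this toy input, four of the five reads align, three align uniquely, and three reads overlap position 145 on `chr1`.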


Summary

Introduction

Next-generation sequencing (NGS) technologies, such as whole-genome sequencing (WGS), whole-exome sequencing (WES), and targeted sequencing, are increasingly applied in biomedical research and medical practice to identify disease- and drug-associated genetic variants and thereby advance precision medicine [1,2]. Advances in NGS-based clinical genome sequencing, together with the adoption of EHRs, will very likely enable patient-centered precision medicine in clinical practice. Genomic data generated by NGS technologies are a vital component in supporting genomic medicine, but the volume and complexity of the data raise challenges for their use in clinical practice [8]. We give an overview of the challenges in processing genomic data and EHRs, provide possible solutions to overcome these challenges using approaches that ensure the safety of genomic data, and present a Big Data solution for identifying clinically actionable variants in sequence data. We also discuss the need for efficient integration of genomic information into EHRs.

Aims
Challenges in Manipulating Genomic Data
Challenges in Manipulating Clinical Data
Cloud Computing
Privacy and Security Challenges of Cloud Computing
NGS Read Alignment
Calling Variants
Variant Annotation
Statistical Analysis of Genomic Data
Security of Genomic Data
Clinically Actionable Genetic Variants
Big Data Analytics in Health Research
Health Informatics
Medical Imaging Analysis
Data Sharing
Findings
Discussion
Conclusions