Genome Modeling System: A Knowledge Management Platform for Genomics.

Malachi Griffith, Richard K Wilson, Edward A Belter, Joshua B Peck, Matthew B Callaway, Kyung H Kim, Xiaoqi Shi, Matthew R Weil, Indraniel Das, Anthony M Brummett, Justin T Lolofie, Christopher A Miller, Daniel C Koboldt, Ken Chen, Xian Fan, Christopher A Maher, Feiyu Du, Amy E Hawkins, Jasreet Hundal, James V Weible, Benjamin J Ainscough, James M Eldred, Michael D Mclellan, Scott M Smith, David E Larson, Mark M Burnett, Thomas P Mooney, Li Ding, David J Dooling, Todd G Hepler, Zachary L Skidmore, Nicole Maher, Richard W Wohlstadter, Charles Lu, Brian R Derickson, Cyriac Kandoth, Nathan D Dees, Elaine R Mardis, Joshua F Mcmichael, Travis E Abbott, Nathaniel G Nutter, Ian T Ferguson, Lynn K Carmichael, Michael J Kiwala, Avinash Ramu, Gabriel E Sanderson, Gary Stiehr, David Morton, Jason Walker, Shawn Leonard, William Schierding, Vincent Magrini, Benjamin J Oberkfell, Todd Wylie, Adam Coffman, Allison Regier, Benjamin Abbott, Craig Pohl, W.e Schroeder, Adam F Dukes, R.l Long, Eric M Clark, Christopher Harris

doi:10.1371/journal.pcbi.1004274

Open Access

https://doi.org/10.1371/journal.pcbi.1004274

Journal: PLOS Computational Biology
Publication Date: Jul 9, 2015
Citations: 100
License type: CC BY 4.0

Affiliation: Washington University in St. Louis

Abstract

In this work, we present the Genome Modeling System (GMS), an analysis information management system capable of executing automated genome analysis pipelines at a massive scale. The GMS framework provides detailed tracking of samples and data coupled with reliable and repeatable analysis pipelines. The GMS also serves as a platform for bioinformatics development, allowing a large team to collaborate on data analysis, or an individual researcher to leverage the work of others effectively within its data management system. Rather than separating ad-hoc analysis from rigorous, reproducible pipelines, the GMS promotes systematic integration between the two. As a demonstration of the GMS, we performed an integrated analysis of whole genome, exome and transcriptome sequencing data from a breast cancer cell line (HCC1395) and matched lymphoblastoid line (HCC1395BL). These data are available for users to test the software, complete tutorials and develop novel GMS pipeline configurations. The GMS is available at https://github.com/genome/gms.

Highlights

The increasing sequence data output of massively parallel sequencing platforms [1] has allowed the application of sequencing to an incredible diversity of research projects in the biological, genomic, and medical fields [2,3,4,5,6].
A typical genome analysis using the Genome Modeling System (GMS) might start from any combination of whole-genome, exome or RNA-seq data and produce alignments against a reference genome, somatic variant calls including single nucleotide variants (SNVs), structural variants (SVs), copy-number variants (CNVs), transcript expression levels, RNA fusion predictions, and more.
To address challenges of scale, tracking, optimization, and reproducibility, we have developed an analysis information management system called the Genome Modeling System (GMS).

Summary

Introduction

The increasing sequence data output of massively parallel sequencing platforms [1] has allowed the application of sequencing to an incredible diversity of research projects in the biological, genomic, and medical fields [2,3,4,5,6]. Data are processed through various analysis pipelines (e.g., reference alignment and somatic variation detection) that in turn are managed and monitored by a workflow system (Box 1).
