Abstract
Background: The introduction of next-generation sequencing (NGS) into molecular cancer diagnostics has led to an increase in the data available for the identification and evaluation of driver mutations and for defining personalized cancer treatment regimens. The meaningful combination of omics data, ie, pathogenic gene variants and alterations with other patient data, to understand the full picture of malignancy has been challenging.
Objective: This study describes the implementation of a system capable of processing, analyzing, and subsequently combining NGS data with other clinical patient data for analysis within and across institutions.
Methods: On the basis of the already existing NGS analysis workflows for the identification of malignant gene variants at the Institute of Pathology of the University Hospital Erlangen, we defined basic requirements on an NGS processing and analysis pipeline and implemented a pipeline based on the GEMINI (GEnome MINIng) open source genetic variation database. For the purpose of validation, this pipeline was applied to data from the 1000 Genomes Project and subsequently to NGS data derived from 206 patients of a local hospital. We further integrated the pipeline into existing structures of data integration centers at the University Hospital Erlangen and combined NGS data with local nongenomic patient-derived data available in Fast Healthcare Interoperability Resources format.
Results: Using data from the 1000 Genomes Project and from the patient cohort as input, the implemented system produced the same results as already established methodologies. Further, it satisfied all our identified requirements and was successfully integrated into the existing infrastructure. Finally, we showed in an exemplary analysis how the data could be quickly loaded into and analyzed in KETOS, a web-based analysis platform for statistical analysis and clinical decision support.
Conclusions: This study demonstrates that the GEMINI open source database can be augmented to create an NGS analysis pipeline. The pipeline generates high-quality results consistent with the already established workflows for gene variant annotation and pathological evaluation. We further demonstrate how NGS-derived genomic and other clinical data can be combined for further statistical analysis, thereby providing for data integration using standardized vocabularies and methods. Finally, we demonstrate the feasibility of the pipeline integration into hospital workflows by providing an exemplary integration into the data integration center infrastructure, which is currently being established across Germany.
Highlights
Combining omics data with other clinical patient data has been the focus of multiple studies in the past years [1,2,3].
This study demonstrates that the GEMINI (GEnome MINIng) open source database can be augmented to create a next-generation sequencing (NGS) analysis pipeline.
We further demonstrate how NGS-derived genomic and other clinical data can be combined for further statistical analysis, thereby providing for data integration using standardized vocabularies and methods.
Summary
Background: Combining omics data with other clinical patient data has been the focus of multiple studies in the past years [1,2,3]. The emergence and widespread use of next-generation sequencing (NGS) for the identification of pathological gene variants has led to an increase in the amount of data available for diagnosis. This has directly improved the quality of medical care for many diseases [4,5,6]. Among existing data integration systems, the OHDSI-OMOP (Observational Health Data Sciences and Informatics-Observational Medical Outcomes Partnership) common data model (CDM) focuses on observational research, and i2b2 focuses on the integration of different types of data into a single clinical repository. These systems require an extra genomics pipeline to be run before the data can be loaded and integrated into the data repositories. The meaningful combination of omics data, ie, pathogenic gene variants and alterations with other patient data, to understand the full picture of malignancy has been challenging.
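The combination step described above, joining variant calls with nongenomic patient data in Fast Healthcare Interoperability Resources (FHIR) format, can be illustrated with a minimal Python sketch. This is not the pipeline from the study. It assumes variant rows have already been exported as dictionaries with hypothetical keys `patient_id`, `gene`, and `impact`, and that patient data arrive as FHIR Patient resources parsed from JSON (only the standard `resourceType`, `id`, `gender`, and `birthDate` elements are used). The function name `combine_variants_with_patients` is likewise an assumption made for illustration.

```python
from collections import defaultdict


def combine_variants_with_patients(variant_rows, fhir_patients):
    """Attach each patient's variant calls to selected FHIR Patient fields.

    variant_rows:  iterable of dicts with the hypothetical keys
                   'patient_id', 'gene', and 'impact'.
    fhir_patients: iterable of dicts parsed from FHIR JSON resources.
    """
    # Group the variant calls by the patient they belong to.
    variants_by_patient = defaultdict(list)
    for row in variant_rows:
        variants_by_patient[row["patient_id"]].append(
            {"gene": row["gene"], "impact": row["impact"]}
        )

    combined = []
    for resource in fhir_patients:
        # Skip anything that is not a Patient resource.
        if resource.get("resourceType") != "Patient":
            continue
        combined.append(
            {
                "patient_id": resource["id"],
                "gender": resource.get("gender"),
                "birthDate": resource.get("birthDate"),
                # Patients without variant calls get an empty list.
                "variants": variants_by_patient.get(resource["id"], []),
            }
        )
    return combined
```

Keying the join on the FHIR resource `id` is a simplification. A real deployment would more likely link records through a pseudonymized identifier managed by the data integration center.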