Causal inferencing of -omics data from The Cancer Genome Atlas: Lung adenocarcinoma tumors for mechanistic disease characterization and feature engineering.

Nimisha Schneider, Scott Marshall, Sergey Korkhov, Renee Deehan, Alexis Foroozan

doi:10.1200/jco.2020.38.15_suppl.e21016

Abstract

e21016

Background: Advances in high-throughput measurement technologies (-omics data) have made it possible to generate high-complexity, high-volume data for oncology research. Researchers are often confronted with many more measurements than samples (p >>> n), which makes it difficult to model the complexity of disease at the level of molecular mechanism and invites overfitting when predictive models are built from such data. Here, we applied a prior knowledge-driven approach to characterize and classify heavy versus light smokers with lung cancer from The Cancer Genome Atlas (TCGA), an open-source repository that catalogs, harmonizes, and hosts -omics data collected from samples generously donated by cancer patients.

Methods: We applied a reverse inferencing approach to systematically interrogate RNAseq measurements from tumor and control biopsies against a knowledgebase of directed gene networks curated from published experiments. If the patterns observed in the data are significantly similar to those in a network, an inference can be made about the directional activity of that network, e.g., increased transcriptional activity of NFKB. Our library was nucleated from an open-source knowledge graph and enhanced with updated and relevant knowledge using the open-source Biological Expression Language framework. Directed networks were either qualitatively scored and used to build disease models, or semi-quantitatively scored and used as classification features.

Results: In lung adenocarcinoma (LUAD) tumors, we detected a pattern of gene signatures indicating a tumor stem cell-like phenotype, characterized by predicted decreases in the activity of pro-differentiation factors and an increased response to hypoxia. Analysis of patients with heavy (> 40) versus light (< 10) pack-year burden suggested an augmented dedifferentiation profile in heavy smokers. In this example, compressing features through directed network scoring improved classification compared with using individual RNA measurements selected by filtration methods.

Conclusions: In silico analysis of lung cancer patient biopsies generated hypotheses implicating stem cell signaling in tumors, with further stratification of this signal by patient pack-year burden. Mechanistic modeling may be a useful approach to the overfitting problem often encountered with -omics data in translational studies. Data from other TCGA indications can be used to evaluate the consistency of this type of approach.
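The abstract does not say which statistic was used to score directed networks, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes each network is a map from downstream gene to the direction (+1 or -1) expected when the upstream node is more active, and that observed tumor-versus-control changes have already been reduced to signs. The two-sided binomial sign test, the net-concordance "strength" score, and the gene names are all illustrative choices, not the authors' method.

```python
from math import comb


def binom_upper_tail(k, n, p=0.5):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def score_network(network, observed, alpha=0.05):
    """Score one directed network against observed expression changes.

    network:  gene -> +1/-1, direction expected if the upstream node is increased
    observed: gene -> +1/-1, sign of the significant change seen in the data
    Both the test and the threshold are assumptions made for illustration.
    """
    shared = [g for g in network if g in observed]
    n = len(shared)
    if n == 0:
        return {"n": 0, "strength": 0.0, "p_value": 1.0, "call": "no call"}

    concordant = sum(1 for g in shared if network[g] == observed[g])
    contradictory = n - concordant

    # Semi-quantitative score in [-1, 1]; usable as a compressed feature.
    strength = (concordant - contradictory) / n

    # Qualitative call: two-sided sign test against chance agreement (p = 0.5).
    k = max(concordant, contradictory)
    p_value = min(1.0, 2 * binom_upper_tail(k, n))
    if p_value < alpha and concordant != contradictory:
        call = "increased" if concordant > contradictory else "decreased"
    else:
        call = "no call"
    return {"n": n, "strength": strength, "p_value": p_value, "call": call}


# Hypothetical toy network: ten placeholder genes, all expected to go up.
toy_network = {f"GENE_{i}": +1 for i in range(10)}
# Hypothetical observations: nine genes agree with the network, one does not.
toy_observed = {f"GENE_{i}": (+1 if i < 9 else -1) for i in range(10)}

result = score_network(toy_network, toy_observed)
flipped = score_network(toy_network, {g: -s for g, s in toy_observed.items()})
```

Computing `strength` for every network in a library and every patient would replace thousands of per-gene measurements with one feature per network, which is the kind of feature compression the Results describe.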

Published in: Journal of Clinical Oncology
