Abstract
Introduction: Identification of pathways that are associated with experimental gene expression profiles is critical for understanding the biological mechanisms in cancer studies. Current standard pathway analysis methods, such as Over-representation Analysis and Gene Set Enrichment Analysis, often use only the gene symbols of the annotated pathways and typically disregard the biological interactions. This approach is not sensitive to the dysregulation of central genes in the pathway networks, whose perturbations may critically affect the biological mechanisms.

Methods: In this study, we have developed a framework that leverages the networks of interactions from the KEGG database to infer the underlying pathways from high-throughput gene expression profiles of cancer data. This methodology uses novel network analysis models with optimized sensitivity towards genes with key positions in pathways, particularly cancer genes. We have previously shown that our novel network models create a distinction between the topological position of cancer genes and non-cancer-related genes in pathways. In this study, we have devised a statistical pipeline to incorporate the network evidence in refining standard pathway analysis models. We have also tested our model on a battery of cancer datasets and incorporated synthetic data evaluation approaches to verify the performance of our model.

Results: The results show that our network-based model is capable of detecting known and well-studied pathway associations when other methods fail to capture them. For example, our model identifies the PI3K-Akt signaling pathway from 274 samples (benign and malignant) in the GSE9899 ovarian cancer dataset when only a small number of genes (19) are perturbed (adjusted p-value = 0.003). In comparison, over-representation analysis produces an adjusted p-value of only 0.15 using the same data for the same pathway. This observation illustrates a case where our model makes informative interpretations, given that dysfunction of the PI3K-Akt pathway has been studied in ovarian cancer. On the same dataset, our methodology identifies the association of the Ras signaling pathway with ovarian cancer from only 7 perturbations (adjusted p-value = 0.034). In comparison, over-representation analysis produces an adjusted p-value of 1.00.

Discussion: The network-based model of this study provides a new perspective for interpretation of high-throughput cancer data by accounting for the topological position of the genes with respect to their associated pathways. This methodology allows researchers to identify pathways that are associated with the perturbations. It also enables evaluation of the importance of each perturbation with regard to the organization of the pathways. This methodology is potentially beneficial to applications in biomarker discovery and drug target development.

Citation Format: Pourya Naderi Yeganeh, M. Taghi Mostafavi. A comprehensive computational framework for interpretation of high-throughput cancer data using annotated pathways [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2019; 2019 Mar 29-Apr 3; Atlanta, GA. Philadelphia (PA): AACR; Cancer Res 2019;79(13 Suppl):Abstract nr 2460.
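
For context on the over-representation analysis baseline cited in the Results, the sketch below shows how such a test is commonly computed: an upper-tail hypergeometric probability that treats every pathway gene identically, which is why it cannot reward perturbations at central network positions. This is a generic illustration and not the authors' network-based model. The function name `ora_p_value` and all numbers in the usage example are hypothetical, and the multiple-testing adjustment behind the reported adjusted p-values is omitted.

```python
from math import comb


def ora_p_value(universe_size, pathway_size, n_perturbed, n_overlap):
    """Upper-tail hypergeometric probability P(X >= n_overlap).

    universe_size: number of genes measured in the experiment
    pathway_size:  number of measured genes annotated to the pathway
    n_perturbed:   number of perturbed (e.g. differentially expressed) genes
    n_overlap:     number of perturbed genes that belong to the pathway
    """
    # Number of ways to draw the perturbed set from the whole universe.
    total = comb(universe_size, n_perturbed)
    # The overlap cannot exceed either the pathway or the perturbed set.
    max_overlap = min(pathway_size, n_perturbed)
    tail = sum(
        comb(pathway_size, k)
        * comb(universe_size - pathway_size, n_perturbed - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Hypothetical toy numbers, not taken from GSE9899 or KEGG.
    p = ora_p_value(universe_size=10000, pathway_size=300,
                    n_perturbed=400, n_overlap=19)
    print(f"unadjusted ORA p-value: {p:.4g}")
```

Because the test depends only on counts, two pathways with the same size and overlap receive the same p-value regardless of which genes are perturbed. That is the insensitivity to gene position that the abstract's network-based approach is designed to address.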