Abstract
Introduction: Despite our general understanding of the flow of genetic information from DNA to RNA to protein, integrating multi-level omics data in a biologically meaningful way remains challenging because disease-associated changes in molecular regulatory mechanisms are complex and dynamic. A case in point is the often-observed phenomenon that key signatures associated with the same phenotypic difference share little in common between the mRNA and proteomic data, even when both are measured on a common set of samples.

Methods: We developed a computational method, NetFill, that integrates prior knowledge into the analysis of expression data to identify a parsimonious set of molecular mediators that connect the mRNA and protein signatures through plausible pathways/networks. NetFill formulates the problem mathematically as a constrained, flow-based linear program that identifies mediator proteins in the protein-protein interactome network linking the mRNA and protein signatures at minimal cost, where the cost is accumulated over the selected nodes and selected edges. Data-dependent node and edge scores were computed from the mRNA and proteomic expression data using Bayes' rule. Assuming conditional independence between the different types of evidence, the node score was derived from the p-values of t-tests between two conditions. The edge score was derived from the p-value of the correlation between two proteins and from the edge confidence score in the static PPI network. In the current implementation of NetFill, nine protein-protein interaction databases (BIND, BioGRID, DIP, HPRD, IntAct, MINT, MIPS, PDZBase, and Reactome) and three pathway databases (KEGG, BioCarta, and NCI) were combined to generate a large protein-protein interactome network with 101,398 interactions. To assess the confidence of the selected interactions, the network search algorithm was applied to 100 bootstrap samples (drawn with replacement) of the mRNA and protein data. The final output of NetFill was a high-confidence network in which every edge occurred in > 50% of the networks generated from the bootstrap samples.

Results: Separate consensus clustering analyses yielded an mRNA signature of 213 genes and a proteomic signature of 73 proteins for a common subtype/cluster of ovarian high-grade serous carcinoma (HGSC) samples; the two signatures had only 40 genes/proteins in common. NetFill identified a parsimonious network with 142 additional mediators linking the mRNA and proteomic signatures. Functional annotation revealed that the Wnt signaling pathway (p = 0.01) and the MAPK signaling pathway (p = 0.03) were significantly enriched in the identified network, indicating that these pathways could be involved in this subtype of ovarian HGSC.

Conclusion: The NetFill method and the reported example demonstrate that existing knowledge of plausible interactions among genes and proteins can help to functionally integrate mRNA and protein data in the multi-level omics characterization of samples.

Citation Format: Li Chen, Bai Zhang, Hui Zhang, Yue Wang, Daniel W. Chan, Zhen Zhang. NetFill: A network-based method to identify a parsimonious set of mediators to link mRNA and protein signatures using existing knowledge. [abstract]. In: Proceedings of the AACR Special Conference on Computational and Systems Biology of Cancer; Feb 8-11 2015; San Francisco, CA. Philadelphia (PA): AACR; Cancer Res 2015;75(22 Suppl 2):Abstract nr B1-26.
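
The bootstrap filtering step described in the Methods (keeping only edges that occur in more than 50% of the bootstrap-derived networks) could be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `consensus_edges`, the representation of each network as a list of protein pairs, and the treatment of edges as undirected are all assumptions made for the example, and the toy networks are invented.

```python
from collections import Counter


def consensus_edges(bootstrap_networks, threshold=0.5):
    """Return edges present in strictly more than `threshold` of the networks.

    `bootstrap_networks` is a list of networks, each given as an iterable of
    (protein_a, protein_b) pairs. Edges are treated as undirected here, which
    is an assumption of this sketch rather than something the abstract states.
    """
    n_networks = len(bootstrap_networks)
    counts = Counter()
    for edges in bootstrap_networks:
        # Normalize orientation and count each edge at most once per network.
        counts.update({frozenset(edge) for edge in edges})
    return {
        tuple(sorted(edge))
        for edge, count in counts.items()
        if count / n_networks > threshold
    }


# Toy example with four hypothetical bootstrap networks.
toy_networks = [
    [("A", "B"), ("B", "C")],
    [("B", "A")],
    [("A", "B"), ("C", "D")],
    [("C", "D")],
]
high_confidence = consensus_edges(toy_networks)
```

In the toy input, the A-B edge appears in three of four networks and is retained, while C-D appears in exactly half and is dropped because the threshold is strict, matching the "> 50%" criterion in the text.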