Abstract

Background: Cancer is a complex disease where various types of molecular aberrations drive the development and progression of malignancies. Large-scale screenings of multiple types of molecular aberrations (e.g., mutations, copy number variations, DNA methylations, gene expressions) are becoming increasingly important in the prognosis and study of cancer. Consequently, a computational model integrating multiple types of information is essential for the analysis of the comprehensive data.

Results: We propose an integrated modeling framework to identify the statistical and putative causal relations between various molecular aberrations and gene expressions in cancer. To reduce spurious associations among the massive number of probed features, we sequentially applied three layers of logistic regression models with increasing complexity and uncertainty regarding the possible mechanisms connecting molecular aberrations and gene expressions. Layer 1 models associate gene expressions with the molecular aberrations on the same loci. Layer 2 models associate expressions with aberrations on different loci that have known mechanistic links. Layer 3 models associate expressions with nonlocal aberrations that have unknown mechanistic links. We applied the layered models to the integrated datasets of NCI-60 cancer cell lines and validated the results with large-scale statistical analysis. Furthermore, we discovered or reaffirmed the following prominent links:
(1) Protein expressions are generally consistent with mRNA expressions.
(2) Several gene expressions are modulated by composite local aberrations. For instance, CDKN2A expressions are repressed by either frame-shift mutations or DNA methylations.
(3) Amplification of chromosome 6q in leukemia elevates the expression of MYB, and the downstream targets of MYB on other chromosomes are up-regulated accordingly.
(4) Amplification of chromosome 3p and hypo-methylation of PAX3 together elevate MITF expression in melanoma, which up-regulates the downstream targets of MITF.
(5) Mutations of TP53 are negatively associated with its direct target genes.

Conclusions: The analysis results on NCI-60 data justify the utility of the layered models for the incoming flow of cancer genomic data. Experimental validations on selected prominent links and application of the layered modeling framework to other integrated datasets will be carried out subsequently.
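The sequential use of the three layers described in the Results can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes, purely for illustration, that each gene's expression is binarized per cell line (1 = up-regulated), that each aberration is a binary indicator, that a gene is passed to the next layer only when no candidate in the current layer is significant, and that significance is a likelihood-ratio statistic above 3.84 (roughly the 5% level for one degree of freedom). The function names and the toy data are hypothetical. For a single binary predictor, the logistic regression fit reproduces the observed group frequencies, so the maximized likelihood can be computed directly from counts.

```python
import math


def bernoulli_loglik(successes, total):
    """Maximized Bernoulli log-likelihood for `successes` ones in `total` trials."""
    if total == 0 or successes == 0 or successes == total:
        return 0.0  # degenerate groups contribute zero (0 * log 0 taken as 0)
    p = successes / total
    return successes * math.log(p) + (total - successes) * math.log(1 - p)


def lr_statistic(expressed, aberrant):
    """Likelihood-ratio statistic of a one-predictor logistic model
    against the intercept-only model, for binary inputs."""
    null = bernoulli_loglik(sum(expressed), len(expressed))
    alt = 0.0
    for state in (0, 1):
        group = [y for y, x in zip(expressed, aberrant) if x == state]
        alt += bernoulli_loglik(sum(group), len(group))
    return 2.0 * (alt - null)


def first_explaining_layer(expressed, layers, threshold=3.84):
    """Try layers in order; return the name of the first layer holding an
    aberration whose statistic exceeds the (assumed) threshold, else None."""
    for name, candidates in layers:
        for aberrant in candidates:
            if lr_statistic(expressed, aberrant) > threshold:
                return name
    return None


# Hypothetical toy data: eight cell lines, one gene.
expressed = [1, 1, 1, 1, 0, 0, 0, 0]
local_cnv = [0, 1, 0, 1, 0, 1, 0, 1]      # layer 1: same locus, uninformative
regulator_mut = [1, 1, 1, 1, 0, 0, 0, 0]  # layer 2: known mechanistic link
layers = [("layer 1", [local_cnv]), ("layer 2", [regulator_mut])]
print(first_explaining_layer(expressed, layers))
```

In this toy case the local aberration carries no information about expression, so the gene falls through to layer 2. Trying the simpler, mechanistically better supported explanations first is one way to limit how many nonlocal candidates are ever tested, which is the motivation the abstract gives for layering.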

Highlights

  • Cancer is a complex disease where various types of molecular aberrations drive the development and progression of malignancies

  • Cancer is a systemic disease where alterations of various physiological processes drive the development and progression of malignancies (e.g., [1,2,3,4,5]). These alterations result from combinations of many cytogenetic/molecular aberrations such as large-scale karyotype changes (e.g., [6]), sequence alterations on protein-coding or regulatory regions (e.g., [7,9]), DNA copy number variations (e.g., [10]), epigenetic modification changes (e.g., [5,11]), alterations of mRNA (e.g., [12]), protein (e.g., [13]) and microRNA (e.g., [14]) expressions

  • In this study we considered the following types of layer 3 associations linking gene expressions with each type of molecular aberration: segment copy number variations (CNVs) on other chromosomes, and mutations and DNA methylations of cancer-related genes obtained from OMIM


Summary

Introduction

Cancer is a complex disease where various types of molecular aberrations drive the development and progression of malignancies. A computational model integrating multiple types of information is essential for the analysis of the comprehensive data. Cancer is a systemic disease where alterations of various physiological processes drive the development and progression of malignancies (e.g., [1,2,3,4,5]). These alterations result from combinations of many cytogenetic/molecular aberrations such as large-scale karyotype changes (e.g., [6]), sequence alterations on protein-coding or regulatory regions (e.g., [7,9]), DNA copy number variations (e.g., [10]), epigenetic modification changes (e.g., [5,11]), and alterations of mRNA (e.g., [12]), protein (e.g., [13]) and microRNA (e.g., [14]) expressions. Examples include probabilistic Bayesian models [34], probabilistic relational models [35], mutual information networks [36], module networks [37] and factor graphs [38,39].
