Abstract
Recent technological advances have led to an exponential increase in data in all the omics fields. It is expected that integration of these different data sources will drastically enhance our knowledge of the biological mechanisms behind genomic diseases such as cancer. However, the integration of different omics data remains a challenge. In this work we propose an intuitive workflow for the integrative analysis of expression, mutation and copy number data taken from the METABRIC study on breast cancer. First, we present evidence that the expression profile of many important breast cancer genes consists of two modes or ‘regimes’, which contain important clinical information. Then, we show how the co-occurrence of these expression regimes can be used as an association measure between genes and validate our findings on the TCGA-BRCA study. Finally, we demonstrate how these co-occurrence measures can also be applied to link expression regimes to genomic aberrations, providing a more complete, integrative view of breast cancer. As a case study, an integrative analysis of the identified MLPH-FOXA1 association is performed, illustrating that the obtained expression associations are intimately linked to the underlying genomic changes.
Reviewers
This article was reviewed by Dirk Walther, Francisco Garcia and Isabel Nepomuceno.
Highlights
Systems genetics approaches that collect genomic information with matching transcript information from phenotypically well characterized individuals provide a powerful way to study the molecular mechanisms underlying complex phenotypes
Using a breast cancer dataset, we demonstrate that measures that count the co-occurrence of expression regimes between different genes (‘co-occurrence measures’) provide a suitable association measure for the analysis of expression data
We found that about 60% of the genes in the METABRIC study (14,652 out of 24,630 different transcripts measured) displayed a multimodal behavior. We show that these modes or ‘regimes’ of a gene convey important clinical information and can be associated with underlying genomic changes
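The idea of a co-occurrence measure can be sketched in a few lines of Python. This is only an illustration, not the measure used in the study: the regime labels are made-up values, the function names `cooccurrence_table` and `jaccard_high` are hypothetical, and the Jaccard index is one simple choice among many possible co-occurrence statistics.

```python
from collections import Counter


def cooccurrence_table(regimes_a, regimes_b):
    """Count how often each pair of regimes occurs in the same sample."""
    if len(regimes_a) != len(regimes_b):
        raise ValueError("both genes must be measured on the same samples")
    return Counter(zip(regimes_a, regimes_b))


def jaccard_high(regimes_a, regimes_b):
    """Jaccard index of the 'high' regime (an assumed, illustrative measure):
    samples high in both genes divided by samples high in at least one."""
    table = cooccurrence_table(regimes_a, regimes_b)
    both = table[(1, 1)]
    either = both + table[(1, 0)] + table[(0, 1)]
    return both / either if either else 0.0


# Hypothetical regime labels for two genes across eight samples
# (1 = high-expression regime, 0 = low-expression regime).
gene_a = [1, 1, 1, 0, 0, 0, 1, 0]
gene_b = [1, 1, 0, 0, 0, 0, 1, 1]

table = cooccurrence_table(gene_a, gene_b)
score = jaccard_high(gene_a, gene_b)
print(dict(table))
print(score)
```

Because the measure only needs discrete labels per sample, the same table could in principle be built between a gene's expression regime and a binary genomic event such as the presence of a mutation.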
Summary
Systems genetics approaches that collect genomic information with matching transcript information from phenotypically well characterized individuals provide a powerful way to study the molecular mechanisms underlying complex phenotypes. For this reason, systems genetics approaches have become increasingly popular in the domain of cancer genomics [1,2,3]. A fundamental problem when integrating expression data with genomic information lies in the different nature of the two datasets. While expression data is quantitative, consisting of continuous values that indicate the degree to which a gene in a sample is being transcribed, genomic data is essentially qualitative. A common way to deal with this problem is to convert the continuous expression measurements into more qualitative, discrete values.
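As a rough illustration of such a conversion, the sketch below splits the expression values of one gene into a low and a high regime. The text here does not say which discretization method is used, so a plain one-dimensional 2-means split stands in for it. The function name `two_regime_threshold` and the expression values are hypothetical.

```python
def two_regime_threshold(values, max_iter=100):
    """Split 1-D values into a low and a high group with 2-means and
    return the midpoint between the two group means as the cut-off."""
    lo, hi = min(values), max(values)
    if lo == hi:
        raise ValueError("values are constant; no two regimes to separate")
    for _ in range(max_iter):
        cut = (lo + hi) / 2
        low = [v for v in values if v <= cut]
        high = [v for v in values if v > cut]
        new_lo = sum(low) / len(low)
        new_hi = sum(high) / len(high)
        if (new_lo, new_hi) == (lo, hi):
            break  # group means stopped moving
        lo, hi = new_lo, new_hi
    return (lo + hi) / 2


# Hypothetical expression values of one gene across eight samples.
expression = [2.1, 2.4, 1.9, 2.2, 7.8, 8.1, 7.5, 8.3]

cut = two_regime_threshold(expression)
# 1 = high-expression regime, 0 = low-expression regime
regimes = [1 if v > cut else 0 for v in expression]
print(cut, regimes)
```

A mixture model fitted per gene would be a natural alternative when the two modes overlap, since it also gives a per-sample probability of belonging to each regime.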