Abstract

Background: Somatic mutations can be used as potential biomarkers for subtyping and predicting outcomes for cancer patients. However, cancer patients often carry many somatic mutations, which do not always concentrate on specific genomic loci, suggesting that the mutations may affect common pathways or gene interaction networks rather than common genes. The challenge is thus to identify the functional relationships among the mutations using multi-modal data. We developed a novel approach for integrating patient somatic mutation, transcriptome and clinical data to mine underlying functional gene groups that can be used to stratify cancer patients into groups with different clinical outcomes. Specifically, we use the distance correlation metric to mine the correlations between expression profiles of mutated genes from different patients.

Results: With this approach, we were able to cluster patients based on the functional relationships between the affected genes using their expression profiles, and to visualize the results using multi-dimensional scaling. Interestingly, we identified a stable subgroup of breast cancer patients that is highly enriched with ER-negative and triple-negative subtypes, and the somatically mutated genes these patients harbor were capable of acting as potential biomarkers to predict patient survival in several different breast cancer datasets, especially in ER-negative cohorts, which have lacked reliable biomarkers.

Conclusions: Our method provides a novel and promising approach for integrating genotyping and gene expression data for patient stratification in complex diseases.

Electronic supplementary material: The online version of this article (doi:10.1186/s12864-016-2902-0) contains supplementary material, which is available to authorized users.

Highlights

  • Somatic mutations can be used as potential biomarkers for subtyping and predicting outcomes for cancer patients

  • Instead of working directly on the gene lists, we propose to examine the functional relationships of the significantly mutated genes (SMGs) between different patients based on functional genomics data

  • We developed a novel approach that integrates genomic, transcriptomic and clinical data of cancer patients and compares the somatic mutations of patients based on their functional relationships in the context of gene expression profiles, tackling the challenge of low overlap of mutated genes among cancer patients

Introduction

Somatic mutations can be used as potential biomarkers for subtyping and predicting outcomes for cancer patients. The Cancer Genome Atlas (TCGA) project hosts comprehensive data, ranging from genomic sequences, genetic variants, and transcriptomic and proteomic data to clinical data, for multiple types of human cancer tissues as well as normal tissues. It is a great resource for integrating data from different levels and mining the hidden interactions among them, which will shed light on the understanding of. The breast cancer (BRCA) project in TCGA has identified three commonly mutated genes, TP53, GATA3, and PIK3CA, but every patient has a much larger number of somatic mutations, which cannot be summarized and compared even at the pathway level [1]. It is of great interest to identify the potential relationships between the mutated genes from different patients.
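The abstract names distance correlation as the measure used to relate the expression profiles of mutated genes from different patients. As an illustration only, the sketch below computes the standard sample distance correlation between two expression matrices that share the same samples (rows) but may contain different genes (columns). The data layout, the function names and the toy values are assumptions made for this example and are not taken from the authors' implementation.

```python
import math


def _pairwise_distances(rows):
    """Euclidean distance between every pair of rows (samples)."""
    n = len(rows)
    return [[math.dist(rows[i], rows[j]) for j in range(n)] for i in range(n)]


def _double_center(d):
    """Subtract row and column means, then add back the grand mean."""
    n = len(d)
    row_means = [sum(row) / n for row in d]
    col_means = [sum(d[i][j] for i in range(n)) / n for j in range(n)]
    grand_mean = sum(row_means) / n
    return [
        [d[i][j] - row_means[i] - col_means[j] + grand_mean for j in range(n)]
        for i in range(n)
    ]


def _mean_product(a, b):
    """Mean of the element-wise product of two n x n matrices."""
    n = len(a)
    return sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / (n * n)


def distance_correlation(x, y):
    """Sample distance correlation between two sample-by-feature matrices.

    x and y are lists of rows; row k of x and row k of y must describe the
    same sample. The number of columns may differ between x and y.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same number of samples")
    if len(x) < 2:
        raise ValueError("at least two samples are required")
    a = _double_center(_pairwise_distances(x))
    b = _double_center(_pairwise_distances(y))
    dcov2 = _mean_product(a, b)  # squared distance covariance
    denom = math.sqrt(_mean_product(a, a) * _mean_product(b, b))
    if denom == 0.0:
        return 0.0  # one of the inputs is constant across samples
    return math.sqrt(max(dcov2, 0.0) / denom)


# Hypothetical toy data: 5 samples; patient A has 2 mutated genes and
# patient B has 3, each column holding one gene's expression values.
genes_a = [[1.0, 2.1], [0.5, 1.9], [2.2, 3.0], [1.8, 2.7], [0.9, 2.0]]
genes_b = [[3.1, 0.2, 1.0], [2.9, 0.1, 0.8], [4.0, 0.9, 1.9],
           [3.8, 0.7, 1.6], [3.0, 0.3, 1.1]]
print(distance_correlation(genes_a, genes_b))
```

Unlike Pearson correlation, distance correlation accepts inputs of different dimensionality and also responds to non-linear dependence, which is why it can compare gene sets of different sizes. A patient-by-patient matrix of such values, converted to dissimilarities, is the kind of input that multi-dimensional scaling would take.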
