Abstract

Background: One of the main goals in cancer studies that include high-throughput microRNA (miRNA) and mRNA data is to find and assess prognostic signatures capable of predicting clinical outcome. Changes in both mRNA and miRNA expression in cancer have been reported to reflect clinical characteristics such as staging and prognosis. Furthermore, miRNA abundance can directly affect target transcripts and translation in tumor cells. Prediction models are typically trained to identify either mRNA or miRNA signatures for patient stratification. With the increasing number of microarray studies collecting mRNA and miRNA from the same patient cohort, there is a need for statistical methods that integrate, or fuse, both kinds of data into one prediction model in order to find a combined signature that improves the prediction.

Results: Here, we propose a new method to fuse miRNA and mRNA data into one prediction model. Since miRNAs are known regulators of mRNAs, we used the correlations between them, together with target prediction information, to build a bipartite graph representing the relations between miRNAs and mRNAs. This graph was used to guide the feature selection in order to improve the prediction. The method is illustrated on a prostate cancer data set comprising 98 patient samples with miRNA and mRNA expression data. Biochemical relapse was used as the clinical endpoint. We show that the bipartite graph, in combination with both data sets, can improve prediction performance as well as the stability of the feature selection.

Conclusions: Fusion of mRNA and miRNA expression data into one prediction model improves clinical outcome prediction in terms of prediction error and stable feature selection. The R source code of the proposed method is available in the supplement.

Highlights

  • One of the main goals in cancer studies including high-throughput microRNA and mRNA data is to find and assess prognostic signatures capable of predicting clinical outcome

  • The correlation coefficient can be tested for a significant shift from zero, leading to a p-value for every mRNA-miRNA pair: $p^{\mathrm{cor}}_{i,j} = P(H_0 : \rho(m_i, mi_j) = 0)$

  • Combined prediction models involving mRNA and miRNA expression data should include the relations between the different features in the model
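The correlation test and the bipartite graph mentioned above can be illustrated with a minimal Python sketch. It is not the authors' R implementation: the function names, the toy expression values, the `predicted_targets` set, and the 0.05 threshold are all hypothetical. The p-value uses the Fisher z-transformation with a normal approximation, chosen here only to keep the example dependency-free, and the rule for combining correlation with target prediction (both must hold) is an assumption made for illustration.

```python
import math


def pearson_r(x, y):
    """Sample Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_p_value(x, y):
    """Return (r, p) with a two-sided p-value for H0: rho = 0.

    Uses the Fisher z-transformation (normal approximation, needs n > 3),
    which is an assumption of this sketch, not the test used in the paper.
    """
    n = len(x)
    r = pearson_r(x, y)
    # Clamp r so that atanh stays finite for perfectly correlated toy data.
    r_clamped = max(min(r, 1.0 - 1e-12), -1.0 + 1e-12)
    z = math.atanh(r_clamped) * math.sqrt(n - 3)
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return r, p


def bipartite_edges(mrna, mirna, predicted_targets, alpha=0.05):
    """Keep a miRNA-mRNA edge if the pair is a predicted target pair
    and its correlation p-value is below alpha (hypothetical rule)."""
    edges = []
    for mirna_name, mirna_expr in mirna.items():
        for mrna_name, mrna_expr in mrna.items():
            r, p = correlation_p_value(mrna_expr, mirna_expr)
            if p < alpha and (mirna_name, mrna_name) in predicted_targets:
                edges.append((mirna_name, mrna_name, r, p))
    return edges


# Toy expression values for 8 samples (invented for illustration only).
mrna = {
    "geneA": [1, 2, 3, 4, 5, 6, 7, 8],
    "geneB": [5, 3, 6, 2, 7, 1, 4, 8],
}
mirna = {
    "miR-1": [8, 7, 6, 5, 4, 3, 2, 1.5],
    "miR-2": [2, 1, 2, 1, 2, 1, 2, 1],
}
# Hypothetical output of a target prediction tool.
predicted_targets = {("miR-1", "geneA"), ("miR-2", "geneB")}

edges = bipartite_edges(mrna, mirna, predicted_targets)
for mirna_name, mrna_name, r, p in edges:
    print(f"{mirna_name} -- {mrna_name}: r = {r:.3f}, p = {p:.3g}")
```

In this toy data only the strongly anti-correlated, predicted pair survives both filters; the weakly correlated predicted pair is dropped. A real analysis would also need to correct the p-values for multiple testing across all mRNA-miRNA pairs.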


Summary

Introduction

One of the main goals in cancer studies that include high-throughput microRNA (miRNA) and mRNA data is to find and assess prognostic signatures capable of predicting clinical outcome. With the increasing number of microarray studies collecting mRNA and miRNA from the same patient cohort, there is a need for statistical methods that integrate, or fuse, both kinds of data into one prediction model in order to find a combined signature that improves the prediction. High-throughput techniques, such as gene expression arrays, have made it possible to identify biomarkers and gene signatures for a wide range of diseases. When integrating data from different levels, their properties and scales have to be taken into account, as well as the relations between the different types of features.

