Abstract

Signalling pathway activation analysis is a powerful approach for extracting biologically relevant features from large-scale transcriptomic and proteomic data. However, modern pathway-based methods often fail to provide stable pathway signatures of a specific phenotype or reliable disease biomarkers. In the present study, we introduce the in silico Pathway Activation Network Decomposition Analysis (iPANDA) as a scalable, robust method for biomarker identification using gene expression data. To obtain pathway activation scores, the iPANDA method combines precalculated gene coexpression data with gene importance factors based on the degree of differential gene expression and pathway topology decomposition. Using Microarray Analysis Quality Control (MAQC) data sets and pretreatment data on Taxol-based neoadjuvant breast cancer therapy from multiple sources, we demonstrate that iPANDA provides significant noise reduction in transcriptomic data and identifies highly robust sets of biologically relevant pathway signatures. We successfully apply iPANDA to stratify breast cancer patients according to their sensitivity to neoadjuvant therapy.
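
To make the idea of a pathway activation score more concrete, the following is a minimal sketch of how coexpression information and gene importance factors could be combined into a single number per pathway. It is an illustration only and not the published iPANDA formula: the function names, the gene labels, the importance values, the +1/-1 activator/repressor roles and the rule that each coexpression module contributes the mean of its members are all assumptions made for this example.

```python
import math

def gene_contribution(gene, fold_changes, importance, roles):
    """Signed, weighted log2 fold change of one gene (illustrative assumption)."""
    return importance[gene] * roles[gene] * math.log2(fold_changes[gene])

def pathway_activation_score(genes, fold_changes, importance, roles, modules=()):
    """Toy pathway score: sum of per-unit contributions.

    A "unit" is either a coexpression module (its members are averaged so that
    correlated genes are not counted several times) or a single gene that does
    not belong to any module. This grouping rule is an assumption of the sketch.
    """
    grouped = set()
    units = []
    for module in modules:
        members = [g for g in module if g in genes]
        if members:
            total = sum(gene_contribution(g, fold_changes, importance, roles)
                        for g in members)
            units.append(total / len(members))
            grouped.update(members)
    for g in genes:
        if g not in grouped:
            units.append(gene_contribution(g, fold_changes, importance, roles))
    return sum(units)

# Hypothetical pathway with four placeholder genes.
genes = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
fold_changes = {"GENE_A": 4.0, "GENE_B": 0.5, "GENE_C": 2.0, "GENE_D": 2.0}  # case / control
importance = {"GENE_A": 1.0, "GENE_B": 0.5, "GENE_C": 1.0, "GENE_D": 0.5}    # assumed weights
roles = {"GENE_A": 1, "GENE_B": -1, "GENE_C": 1, "GENE_D": 1}                # +1 activator, -1 repressor
modules = [["GENE_C", "GENE_D"]]                                             # assumed coexpressed pair

score = pathway_activation_score(genes, fold_changes, importance, roles, modules)
print(score)
```

In this toy setting, a downregulated repressor (GENE_B) pushes the score up just as an upregulated activator does, and the two coexpressed genes are treated as one unit rather than two independent pieces of evidence.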

Highlights

  • The application of novel supervised learning algorithms to large-scale transcriptomic data has the potential to transform conventional approaches for disease classification, personalized medicine and development of prognostic models

  • We suggest a novel method for large-scale transcriptomic data analysis called in silico Pathway Activation Network Decomposition Analysis

  • We demonstrate the performance of this method by using multiple paclitaxel breast cancer treatment data sets obtained from Gene Expression Omnibus (GEO)[13]

Summary

Introduction

The application of novel supervised learning algorithms to large-scale transcriptomic data has the potential to transform conventional approaches for disease classification, personalized medicine and development of prognostic models. Although a number of data normalization approaches have been proposed in recent years[1,2], it remains difficult to achieve robust results across a group of independent data sets, even when they are obtained from the same profiling platform[3]. This may be explained by a range of biological factors, such as wide heterogeneity among individuals at the population level or variance in the cell cycle stage of the cells used, or by a set of technical factors, such as sample preparation or batch variations in reagents. The Oncofinder algorithm represents a halfway approach, in which information about pathway topology is used to assign activation or repression roles to particular genes in the pathway and to estimate its overall activation[12]. Although helpful, these approaches cannot overcome the other above-mentioned limitations, creating a need for new large-scale analytical methodologies that translate complex transcriptomic changes more accurately into the network of biologically relevant signalling axes. Using neoadjuvant therapy pretreatment breast cancer data with known treatment outcome and receptor status (estrogen receptor and HER2), we show that iPANDA is capable of producing highly robust sets of pathway markers, which can be further used for stratification of samples into responder and non-responder groups.

