Pathway activity inference for multiclass disease classification through a mathematical programming optimisation framework.

Lingjian Yang, Chrysanthi Ainali, Sophia Tsoka, Lazaros G Papageorgiou

doi:10.1186/s12859-014-0390-2

Open Access

https://doi.org/10.1186/s12859-014-0390-2

Abstract

Background: Applying machine learning methods to microarray gene expression profiles for disease classification is a popular way to derive biomarkers, i.e. sets of genes that can predict disease state or outcome. Traditional approaches, in which the expression of each gene is treated independently, suffer from low prediction accuracy and are difficult to interpret biologically. Current research efforts focus on integrating information on protein interactions, through biochemical pathway datasets, with expression profiles to propose pathway-based classifiers that can enhance disease diagnosis and prognosis. As most pathway activity inference methods in the literature are either unsupervised or applied only to two-class datasets, there is scope to address these limitations with novel methodologies.

Results: A supervised multiclass pathway activity inference method using optimisation techniques is reported. For each pathway expression dataset, patterns of its constituent genes are summarised into one composite feature, termed pathway activity, and a novel mathematical programming model is proposed to infer this feature as a weighted linear summation of the expression of its constituent genes. Gene weights are determined by the optimisation model in such a way that the resulting pathway activity has optimal discriminative power with regard to disease phenotypes. Classification is then performed on the resulting low-dimensional pathway activity profile.

Conclusions: The model was evaluated on a variety of published gene expression profiles covering different types of disease. We show that it not only improves classification accuracy but also performs well on multiclass disease datasets, a limitation of other approaches in the literature. Desirable features of the model include the ability to control the maximum number of genes that may participate in determining pathway activity, which may be pre-specified by the user. Overall, this work highlights the potential of building pathway-based multi-phenotype classifiers for accurate disease diagnosis and prognosis.

Electronic supplementary material: The online version of this article (doi:10.1186/s12859-014-0390-2) contains supplementary material, which is available to authorized users.
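
The weighted linear summation described in the abstract can be illustrated with a minimal Python sketch. The gene names, pathway names, weights and function names below are hypothetical and chosen only for illustration. In the paper the gene weights are determined by the optimisation model, which this sketch does not implement; here they are simply given as input.

```python
from typing import Dict, List, Sequence


def pathway_activity(expression: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted linear sum of constituent-gene expression for one sample."""
    if len(expression) != len(weights):
        raise ValueError("expression and weights must have the same length")
    return sum(w * x for w, x in zip(weights, expression))


def activity_profile(
    samples: List[Dict[str, float]],
    pathways: Dict[str, List[str]],
    weights: Dict[str, Dict[str, float]],
) -> List[Dict[str, float]]:
    """Collapse a gene-level expression table into one feature per pathway."""
    profile = []
    for sample in samples:
        row = {}
        for name, genes in pathways.items():
            expr = [sample[g] for g in genes]
            w = [weights[name][g] for g in genes]
            row[name] = pathway_activity(expr, w)
        profile.append(row)
    return profile


# Hypothetical toy data: two samples, three genes, two pathways.
samples = [
    {"g1": 2.0, "g2": 1.0, "g3": 0.5},
    {"g1": 0.0, "g2": 3.0, "g3": 1.5},
]
pathways = {"P1": ["g1", "g2"], "P2": ["g2", "g3"]}
# Assumed weights; a zero weight means the gene does not contribute.
weights = {
    "P1": {"g1": 0.5, "g2": -0.5},
    "P2": {"g2": 1.0, "g3": 0.0},
}

profile = activity_profile(samples, pathways, weights)
```

Each sample is reduced from three gene-level values to two pathway-level values, which is the low-dimensional profile on which classification is then performed.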

Highlights

Applying machine learning methods on microarray gene expression profiles for disease classification problems is a popular method to derive biomarkers, i.e. sets of genes that can predict disease state or outcome
Using a number of published gene expression profile datasets, we show that this pathway activity inference method is robust in terms of the number of constituent genes allowed to determine the pathway activity metric
The DIGS model can identify a subset of pathway constituent genes with cardinality no more than the user-specified value, NoG, whose expression can be combined via different weights to best separate samples from different phenotypes
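
The cardinality limit in the last highlight can be written down in a generic form. The formulation below is a standard big-M construction and is only an assumption about how such a limit is typically expressed; it is not taken from the paper, and the symbols $a_s$, $x_{sm}$, $w_m$, $y_m$ and $U$ are introduced here for illustration. The objective that makes the activity discriminative is not given in this section and is omitted.

```latex
% Pathway activity of sample s as a weighted sum over constituent genes m
a_s = \sum_{m} w_m \, x_{sm}

% Binary y_m marks whether gene m is selected; U is an assumed bound on |w_m|
-U \, y_m \le w_m \le U \, y_m, \qquad y_m \in \{0, 1\}

% At most NoG genes may carry a non-zero weight
\sum_{m} y_m \le \mathit{NoG}
```

When $y_m = 0$ the bounds force $w_m = 0$, so the number of genes contributing to $a_s$ cannot exceed NoG.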

Summary

Introduction

Applying machine learning methods to microarray gene expression profiles for disease classification is a popular way to derive biomarkers, i.e. sets of genes that can predict disease state or outcome. The gene expression matrix serves as input to a classification task in which each sample is allocated to a phenotypic class via the gene signatures or biomarkers that best differentiate between outcomes. Such disease classification tasks have been successful in deriving biomarkers for diagnosis [3], prognosis [4,5,6,7] and response to treatment [8,9] in complex disorders. A classifier can be trained on the reduced feature set to predict the disease status or prognostic characteristic of any given sample [14,15,16,17].
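
As a rough illustration of training a classifier on a reduced feature set with more than two phenotypes, the sketch below uses a nearest-centroid rule in plain Python. This classifier is a stand-in chosen for brevity; the section does not state which classifier the authors used. The class labels and feature values are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def fit_centroids(
    features: Sequence[Sequence[float]], labels: Sequence[str]
) -> Dict[str, List[float]]:
    """Compute the mean feature vector of each class."""
    sums: Dict[str, List[float]] = {}
    counts: Dict[str, int] = defaultdict(int)
    for row, label in zip(features, labels):
        if label not in sums:
            sums[label] = [0.0] * len(row)
        sums[label] = [s + v for s, v in zip(sums[label], row)]
        counts[label] += 1
    return {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }


def predict(centroids: Dict[str, List[float]], row: Sequence[float]) -> str:
    """Assign a sample to the class with the closest centroid."""
    def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(centroids, key=lambda label: sq_dist(centroids[label], row))


# Hypothetical reduced features (e.g. two pathway activities per sample)
# for three phenotypes.
features = [
    [0.0, 0.1], [0.2, 0.0],      # "normal"
    [1.0, 1.1], [1.2, 0.9],      # "subtypeA"
    [2.0, -1.0], [2.2, -1.2],    # "subtypeB"
]
labels = ["normal", "normal", "subtypeA", "subtypeA", "subtypeB", "subtypeB"]

centroids = fit_centroids(features, labels)
prediction = predict(centroids, [1.1, 1.0])
```

Because the rule only compares distances to per-class means, it extends to any number of phenotypes without modification.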

Journal: BMC Bioinformatics
Publication Date: Dec 1, 2014
Citations: 8
License type: CC BY 4.0
