Abstract

Background: Acute Myeloid Leukemia (AML) is characterized by various cytogenetic and molecular abnormalities. Detection of these abnormalities is important in the risk-classification of patients but requires laborious experimentation. Various studies have shown that gene expression profiles (GEP), and the gene signatures derived from GEP, can be used for the prediction of subtypes in AML. Similarly, successful prediction has also been achieved by exploiting DNA-methylation profiles (DMP). There are, however, no studies that have compared classification accuracy and performance between GEP and DMP, nor are there studies that have integrated both types of data to determine whether predictive power can be improved.

Approach: Here, we used 344 well-characterized AML samples for which both gene expression and DNA-methylation profiles are available. We created three different classification strategies (early, late and no integration of these datasets) and used them to predict AML subtypes using a logistic regression model with Lasso regularization.

Results: We illustrate that both gene expression and DNA-methylation profiles contain distinct patterns that contribute to discriminating AML subtypes and that an integration strategy can exploit these patterns to achieve synergy between both data types. We show that concatenation of features from both data sets, i.e. early integration, improves the predictive power compared to classifiers trained on GEP or DMP alone. A more sophisticated strategy, i.e. the late integration strategy, employs a two-layer classifier which outperforms the early integration strategy.

Conclusion: We demonstrate that prediction of known cytogenetic and molecular abnormalities in AML can be further improved by integrating GEP and DMP profiles.

Highlights

  • Over the last decade, microarray technologies that measure gene expression profiles (GEP) have proved effective for detecting biomarkers for the diagnosis and prognosis of disease and for helping to determine drug treatment [1]

  • We illustrate that both gene expression and DNA-methylation profiles contain distinct patterns that contribute to discriminating Acute Myeloid Leukemia (AML) subtypes and that an integration strategy can exploit these patterns to achieve synergy between both data types

  • We show that concatenation of features from both data sets, i.e. early integration, improves the predictive power compared to classifiers trained on GEP or DNA-methylation profiles (DMP) alone


Summary

Introduction

Microarray technologies that measure gene expression profiles (GEP) have proved effective for detecting biomarkers for the diagnosis and prognosis of disease and for helping to determine drug treatment [1]. Some subtypes can already be predicted very well using GEP, e.g. t(8;21), t(15;17), inv(16), or CEBPA double-mutant [7]. Challenges remain, however, for patients carrying abnormalities in FLT3 or NPM1, or certain chromosomal abnormalities (3q, 11q23, 5q, 7q) [8], which are much more difficult to predict accurately. This suggests that gene expression profiles do not contain features that are sufficiently discriminative to distinguish those groups of patients from the others. We created three different classification strategies (early, late and no integration of the gene expression and DNA-methylation datasets) and used them to predict AML subtypes using a logistic regression model with Lasso regularization.
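The three strategies can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a binary task (one subtype versus the rest), uses synthetic Gaussian data in place of real GEP and DMP measurements, and fits the L1-penalised logistic regression with a simple proximal gradient loop. All function names, feature counts and hyperparameters (`lam`, `lr`, `epochs`) are hypothetical choices made for illustration.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def soft_threshold(w, t):
    # Proximal operator of the L1 (Lasso) penalty: shrink w toward zero by t.
    if w > t:
        return w - t
    if w < -t:
        return w + t
    return 0.0

def fit_lasso_logreg(X, y, lam=0.01, lr=0.1, epochs=300):
    # L1-penalised logistic regression via proximal gradient descent.
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            grad_b += err / n
            for j in range(d):
                grad_w[j] += err * xi[j] / n
        w = [soft_threshold(wj - lr * gj, lr * lam) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b  # the intercept is not penalised
    return w, b

def predict_proba(model, X):
    w, b = model
    return [sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) for xi in X]

def accuracy(probs, y):
    return sum((p >= 0.5) == (yi == 1) for p, yi in zip(probs, y)) / len(y)

def make_view(y, n_features, n_informative, shift, rng):
    # Synthetic stand-in for one data type; only the first
    # n_informative features carry class signal.
    return [[rng.gauss(shift * yi if j < n_informative else 0.0, 1.0)
             for j in range(n_features)] for yi in y]

def take(rows, idx):
    return [rows[i] for i in idx]

rng = random.Random(0)
y = [i % 2 for i in range(120)]        # 1 = subtype present (assumed binary task)
gep = make_view(y, 20, 2, 1.0, rng)    # hypothetical expression features
dmp = make_view(y, 20, 2, 1.0, rng)    # hypothetical methylation features
train, test = range(0, 80), range(80, 120)
y_tr, y_te = take(y, train), take(y, test)

# No integration: one classifier per data type.
m_gep = fit_lasso_logreg(take(gep, train), y_tr)
m_dmp = fit_lasso_logreg(take(dmp, train), y_tr)

# Early integration: concatenate the features of both data types.
both = [g + d for g, d in zip(gep, dmp)]
m_early = fit_lasso_logreg(take(both, train), y_tr)

# Late integration: a second-layer classifier on the first-layer outputs.
# Simplification: the second layer is trained on in-sample predictions;
# out-of-fold predictions would be the more careful choice.
def stack(Xg, Xd):
    return [[pg, pd] for pg, pd in
            zip(predict_proba(m_gep, Xg), predict_proba(m_dmp, Xd))]

m_late = fit_lasso_logreg(stack(take(gep, train), take(dmp, train)), y_tr, lam=0.0)

results = {
    "GEP only": accuracy(predict_proba(m_gep, take(gep, test)), y_te),
    "DMP only": accuracy(predict_proba(m_dmp, take(dmp, test)), y_te),
    "early": accuracy(predict_proba(m_early, take(both, test)), y_te),
    "late": accuracy(predict_proba(m_late, stack(take(gep, test), take(dmp, test))), y_te),
}
for name, acc in results.items():
    print(f"{name}: {acc:.2f}")
```

The accuracies printed for this toy data say nothing about the ranking of strategies on real AML samples; the sketch only shows how the inputs to each classifier differ between the three strategies.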

