Cross-platform analysis of cancer microarray data improves gene expression based classification of phenotypes.

Patrick Warnat, Benedikt Brors, Roland Eils

doi:10.1186/1471-2105-6-265

Abstract

Background: The extensive use of DNA microarray technology in the characterization of the cell transcriptome is leading to an ever-increasing amount of microarray data from cancer studies. Although similar questions for the same type of cancer are addressed in these different studies, a comparative analysis of their results is hampered by the use of heterogeneous microarray platforms and analysis methods.

Results: In contrast to a meta-analysis approach where results of different studies are combined on an interpretative level, we investigate here how to directly integrate raw microarray data from different studies for the purpose of supervised classification analysis. We use median rank scores and quantile discretization to derive numerically comparable measures of gene expression from different platforms. These transformed data are then used for training of classifiers based on support vector machines. We apply this approach to six publicly available cancer microarray gene expression data sets, which consist of three pairs of studies, each examining the same type of cancer, i.e. breast cancer, prostate cancer or acute myeloid leukemia. For each pair, one study was performed by means of cDNA microarrays and the other by means of oligonucleotide microarrays. In each pair, high classification accuracies (> 85%) were achieved with training and testing on data instances randomly chosen from both data sets in a cross-validation analysis. To exemplify the potential of this cross-platform classification analysis, we use two leukemia microarray data sets to show that important genes with regard to the biology of leukemia are selected in an integrated analysis, which are missed in either single-set analysis.

Conclusion: Cross-platform classification of multiple cancer microarray data sets yields discriminative gene expression signatures that are found and validated on a large number of microarray samples, generated by different laboratories and microarray technologies. Predictive models generated by this approach are better validated than those generated on a single data set, while showing high predictive power and improved generalization performance.
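The two transformations named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the number of bins, and the exact handling of ties are assumptions made for illustration, and the sketch assumes both data sets have already been reduced to the same ordered list of genes. Quantile discretization is read here as assigning each gene within a sample to an equal-frequency bin by rank; median rank scores are read as replacing each value in a sample by the per-gene median of a reference data set that has the same rank.

```python
from statistics import median


def quantile_discretize(sample, n_bins=8):
    """Map each expression value of one sample to an equal-frequency bin
    index in 0..n_bins-1, based only on its rank within the sample.
    The default of 8 bins is an arbitrary choice for this sketch."""
    n = len(sample)
    # Gene indices ordered from lowest to highest expression.
    order = sorted(range(n), key=lambda i: sample[i])
    bins = [0] * n
    for rank, gene in enumerate(order):
        bins[gene] = rank * n_bins // n
    return bins


def median_rank_scores(reference, sample):
    """Replace each value in `sample` by the reference value of equal rank.

    `reference` is a list of samples from the reference platform, each a
    list of gene values in the same gene order as `sample` (an assumption
    of this sketch)."""
    n_genes = len(sample)
    # Per-gene median across reference samples, sorted ascending.
    ref_medians = sorted(
        median(ref[g] for ref in reference) for g in range(n_genes)
    )
    order = sorted(range(n_genes), key=lambda i: sample[i])
    scores = [0.0] * n_genes
    for rank, gene in enumerate(order):
        scores[gene] = ref_medians[rank]
    return scores


# Toy data: values on very different scales become comparable.
print(quantile_discretize([5.0, 1.0, 3.0, 9.0], n_bins=2))
# -> [1, 0, 0, 1]
print(median_rank_scores([[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]],
                         [100.0, 10.0, 50.0]))
# -> [4.0, 2.0, 3.0]
```

Both transformations depend only on the rank order of genes within a sample, which is why they can yield numerically comparable values for cDNA and oligonucleotide data. The transformed vectors could then be passed to any support vector machine implementation for training.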

Highlights

The extensive use of DNA microarray technology in the characterization of the cell transcriptome is leading to an ever-increasing amount of microarray data from cancer studies.
We investigated six publicly available cancer microarray gene expression data sets to perform cross-platform supervised classification analysis.
The data sets form three pairs of studies, each pair examining the same type of cancer and consisting of one study using cDNA arrays and one study based on oligonucleotide arrays.

Summary

Introduction

The extensive use of DNA microarray technology in the characterization of the cell transcriptome is leading to an ever-increasing amount of microarray data from cancer studies. Although similar questions for the same type of cancer are addressed in these different studies, a comparative analysis of their results is hampered by the use of heterogeneous microarray platforms and analysis methods. With an increasing number of microarray data sets becoming available, the comparison of studies with similar research goals, e.g. identifying genes that are differentially expressed in normal versus tumour tissue, has gained high importance. Some studies propose methods for meta-analysis of microarray data with the goal of identifying significantly differentially expressed genes across studies by using statistical techniques that avoid the direct comparison of gene expression values [8,9,10,11,12,13,14].

Journal: BMC Bioinformatics
Publication Date: Nov 4, 2005
Citations: 264
License type: cc-by

Similar Papers

Hybrid feature selection model based on relief‐based algorithms and regulizer algorithms for cancer classification
Ibrahim I.M Manhrawy ... Passent El‐Kafrawy
Concurrency and Computation: Practice and Experience | VOL. 33
28 Jan 2021

A Comparative Study of Statistical and Artificial Intelligence based Classification Algorithms on Central Nervous System Cancer Microarray Gene Expression Data
Mustafa Turan Arslan
International Journal of Intelligent Systems and Applications in Engineering | VOL. 4
26 Dec 2016

L1TD1 - a prognostic marker for colon cancer
Deepankar Chakroborty ... Ari Ristimäki
BMC Cancer | VOL. 19
23 Jul 2019

Classification and Diagnostic Output Prediction of Cancer Using Gene Expression Profiling and Supervised Machine Learning Algorithms
Changkyoo Yoo ... Krist V Gernaey
JOURNAL OF CHEMICAL ENGINEERING OF JAPAN | VOL. 41
01 Jan 2008
