Analysis of complexity indices for classification problems: Cancer gene expression data

Ana C Lorena, Ivan G Costa, Newton Spolaôr, Marcilio C.P De Souto

doi:10.1016/j.neucom.2011.03.054

Abstract

Currently, cancer diagnosis at a molecular level has been made possible through the analysis of gene expression data. More specifically, one usually uses machine learning (ML) techniques to build automatic diagnosis models (classifiers) from cancer gene expression data. Such data often present characteristics that can have a negative impact on the generalization ability of the resulting classifiers, among them data sparsity and an unbalanced class distribution. We investigate the results of a set of indices able to extract intrinsic complexity information from the data. Such measures can be used to analyze, among other things, which particular characteristics of cancer gene expression data most affect the prediction ability of support vector machine classifiers. In this context, we also show that, by applying a proper feature selection procedure to the data, one can reduce the influence of those characteristics on the error rates of the induced classifiers.
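To make the idea of a data complexity index concrete, the following is a minimal sketch in Python. The abstract does not list the indices the authors used, so the choices here are assumptions: Fisher's discriminant ratio (a commonly used class-separability measure, restricted here to two classes), a class imbalance ratio, and a samples-per-feature ratio as a rough proxy for data sparsity. All function names are hypothetical.

```python
from collections import Counter
from statistics import fmean, pvariance


def fisher_discriminant_ratio(X, y):
    """Largest per-feature Fisher ratio for a two-class problem.

    For each feature: (mean_a - mean_b)^2 / (var_a + var_b).
    Higher values suggest at least one feature separates the classes well.
    """
    labels = sorted(set(y))
    if len(labels) != 2:
        raise ValueError("this sketch handles exactly two classes")
    rows_a = [row for row, label in zip(X, y) if label == labels[0]]
    rows_b = [row for row, label in zip(X, y) if label == labels[1]]
    best = 0.0
    # zip(*rows) transposes the rows into per-feature columns.
    for col_a, col_b in zip(zip(*rows_a), zip(*rows_b)):
        spread = pvariance(col_a) + pvariance(col_b)
        if spread == 0:
            # Simplification (assumption): features that are constant
            # within both classes are skipped.
            continue
        gap = (fmean(col_a) - fmean(col_b)) ** 2
        best = max(best, gap / spread)
    return best


def imbalance_ratio(y):
    """Size of the largest class divided by the size of the smallest."""
    counts = Counter(y).values()
    return max(counts) / min(counts)


def samples_per_feature(X):
    """Number of samples divided by number of features."""
    return len(X) / len(X[0])


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    X = [[0, 1], [2, 1], [9, 1], [11, 1], [9, 1], [11, 1]]
    y = [0, 0, 1, 1, 1, 1]
    print(fisher_discriminant_ratio(X, y))
    print(imbalance_ratio(y))
    print(samples_per_feature(X))
```

On real gene expression data the samples-per-feature ratio is typically far below 1, since there are usually many more genes than samples; this is the kind of characteristic such indices are meant to expose.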

Journal: Neurocomputing
Publication Date: Jul 31, 2011
Citations: 50

Similar Papers

Comparative Analysis of Different Label-Free Mass Spectrometry Based Protein Abundance Estimates and Their Correlation with RNA-Seq Gene Expression Data
Kang Ning ... Damian Fermin
Journal of Proteome Research | VOL. 11
29 Feb 2012

Complexity measures of supervised classifications tasks: A case study for cancer gene expression data
Marcilio C P De Souto ... Newton Spolaor
01 Jul 2010

Abstract 4283: Cross-species hybridization of microarrays for studying tumor transcriptome of brain metastasis
Eun Sung Park ... Ju-Seog Lee
Cancer Research | VOL. 72
15 Apr 2012

Cross-species hybridization of microarrays for studying tumor transcriptome of brain metastasis
Eun Sung Park ... Hyun Goo Woo
Proceedings of the National Academy of Sciences | VOL. 108
10 Oct 2011
