ANMM4CBR: a case-based reasoning method for gene expression data classification.

Bangpeng Yao, Shao Li

doi:10.1186/1748-7188-5-14

Abstract

Background: Accurate classification of microarray data is critical for successful clinical diagnosis and treatment. The "curse of dimensionality" problem and noise in the data, however, undermine the performance of many algorithms.

Method: In order to obtain a robust classifier, a novel Additive Nonparametric Margin Maximum for Case-Based Reasoning (ANMM4CBR) method is proposed in this article. ANMM4CBR employs a case-based reasoning (CBR) method for classification. CBR is a suitable paradigm for microarray analysis, where the rules that define the domain knowledge are difficult to obtain because usually only a small number of training samples are available. Moreover, in order to select the most informative genes, we propose to perform feature selection via additively optimizing a nonparametric margin maximum criterion, which is defined based on gene pre-selection and sample clustering. Our feature selection method is very robust to noise in the data.

Results: The effectiveness of our method is demonstrated on both simulated and real data sets. We show that the ANMM4CBR method performs better than some state-of-the-art methods such as support vector machine (SVM) and k nearest neighbor (kNN), especially when the data contains a high level of noise.

Availability: The source code is attached as an additional file of this paper.
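The general idea of additive margin-based gene selection followed by case retrieval can be illustrated with a minimal sketch. This is not the ANMM criterion from the paper: it assumes a simplified margin (squared distance to the nearest sample of another class minus squared distance to the nearest sample of the same class), omits gene pre-selection and sample clustering, and uses hypothetical function names and toy data throughout.

```python
# Illustrative sketch only: a simplified nearest-hit / nearest-miss margin,
# NOT the ANMM criterion from the paper (which also relies on gene
# pre-selection and sample clustering, both omitted here).
# All function names and the toy data are hypothetical.


def sq_dist(a, b, genes):
    """Squared Euclidean distance restricted to the given gene indices."""
    return sum((a[g] - b[g]) ** 2 for g in genes)


def total_margin(X, y, genes):
    """Sum over samples of (nearest-miss distance - nearest-hit distance)."""
    margin = 0.0
    for i, xi in enumerate(X):
        hits = [sq_dist(xi, xj, genes)
                for j, xj in enumerate(X) if j != i and y[j] == y[i]]
        misses = [sq_dist(xi, xj, genes)
                  for j, xj in enumerate(X) if y[j] != y[i]]
        if hits and misses:
            margin += min(misses) - min(hits)
    return margin


def select_genes(X, y, n_genes):
    """Greedy (additive) forward selection: add one gene per round,
    choosing the gene that gives the largest total margin."""
    selected = []
    candidates = list(range(len(X[0])))
    for _ in range(min(n_genes, len(candidates))):
        best = max(candidates,
                   key=lambda g: total_margin(X, y, selected + [g]))
        selected.append(best)
        candidates.remove(best)
    return selected


def retrieve_label(X, y, genes, query):
    """CBR-style retrieval: reuse the label of the most similar stored case."""
    nearest = min(range(len(X)), key=lambda i: sq_dist(X[i], query, genes))
    return y[nearest]


# Toy data: 4 samples x 3 genes; only gene 0 separates the two classes,
# genes 1 and 2 behave like noise with respect to the labels.
X = [
    [0.1, 5.0, 2.0],
    [0.2, 1.0, 9.0],
    [0.9, 5.1, 2.1],
    [1.0, 1.1, 9.1],
]
y = ["normal", "normal", "tumor", "tumor"]

genes = select_genes(X, y, n_genes=1)
prediction = retrieve_label(X, y, genes, query=[0.95, 3.0, 5.0])
```

On this toy data the selection step picks gene 0, and the query is assigned the label "tumor" because its closest stored cases on that gene are the tumor samples. Restricting retrieval to the selected genes is what keeps the two noisy genes from dominating the distance.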

Highlights

Accurate classification of microarray data is critical for successful clinical diagnosis and treatment
We show that the ANMM4CBR method performs better than some state-of-the-art methods such as support vector machine (SVM) and k nearest neighbor, especially when the data contains a high level of noise
We carried out experiments using simulated data as well as real microarray data to test the performance of ANMM4CBR

Summary

Introduction

Accurate classification of microarray data is critical for successful clinical diagnosis and treatment. Two typical problems that researchers want to solve using microarray data are: (1) discovering informative genes for classification based on different cell types or diseases [1]; (2) clustering and arranging genes according to their similarity in expression patterns [2]. Many commonly used classifiers are rule-based or statistical-based. One challenge for these methods on microarray data is the small sample size problem. With a limited number of training samples, it is difficult to obtain domain knowledge for rule-based systems or to estimate accurate parameters (such as the mean and standard deviation) for statistical-based approaches.



Journal: Algorithms for Molecular Biology
Publication Date: Jan 6, 2010
Citations: 46
License type: CC BY 2.0
