Abstract

Background: Using DNA microarrays, we have developed two novel models for tumor classification and target gene prediction. First, gene expression profiles are summarized by optimally selected Self-Organizing Maps (SOMs), followed by tumor sample classification by Fuzzy C-means clustering. Then, the prediction of marker genes is accomplished by either manual feature selection (visualizing the weighted/mean SOM component plane) or automatic feature selection (by pair-wise Fisher's linear discriminant).

Results: The proposed models were tested on four published datasets: (1) leukemia, (2) colon cancer, (3) brain tumors, and (4) NCI cancer cell lines. The models gave class predictions with markedly reduced error rates compared to other class prediction approaches, and the importance of feature selection in microarray data analysis was also emphasized.

Conclusions: Our models identify marker genes with predictive potential, often better than other available methods in the literature. The models are potentially useful for medical diagnostics and may reveal some insights into cancer classification. Additionally, we illustrated two limitations in tumor classification from microarray data that stem from the biology underlying the data: (1) the class size of the data, and (2) the internal structure of the classes. These limitations are not specific to the classification models used.

Highlights

  • Using DNA microarrays, we have developed two novel models for tumor classification and target gene prediction

  • We demonstrate the performance of the two suggested models using four microarray data sets: (1) leukaemia (http://www-genome.wi.mit.edu/cgi-bin/cancer/publications/pub_menu.cgi); (2) colon cancer (http://microarray.princeton.edu/oncology/affydata/index.html); (3) brain tumors (http://www-genome.wi.mit.edu/mpr/CNS/); and (4) cancer cell lines from the NCI60 data set (http://genome-www.stanford.edu/nci60/)

  • Leukemia data: The data set used here is an acute leukemia data set published by Golub et al. The original training data set consisted of 38 bone marrow samples: 27 acute lymphoblastic leukemias (ALL) and 11 acute myeloid leukemias (AML)
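To make the idea of automatic feature selection on a two-class data set of this shape more concrete, the following is a minimal sketch of pair-wise gene ranking. It assumes one common per-gene form of Fisher's criterion, (mean_a - mean_b)^2 / (var_a + var_b), which may differ from the exact formulation used by the authors. The function names `fisher_scores` and `pairwise_top_genes` are hypothetical, and the expression matrix is synthetic (random values with one artificially shifted gene), not the Golub et al. data.

```python
import numpy as np
from itertools import combinations

def fisher_scores(X, y, class_a, class_b):
    """Per-gene Fisher criterion between two classes (assumed form).

    X: (n_samples, n_genes) expression matrix; y: (n_samples,) class labels.
    """
    A = X[y == class_a]
    B = X[y == class_b]
    between = (A.mean(axis=0) - B.mean(axis=0)) ** 2
    within = A.var(axis=0, ddof=1) + B.var(axis=0, ddof=1)
    return between / (within + 1e-12)  # small constant avoids division by zero

def pairwise_top_genes(X, y, k=5):
    """For every pair of classes, return indices of the k best-scoring genes."""
    result = {}
    for a, b in combinations(np.unique(y).tolist(), 2):
        scores = fisher_scores(X, y, a, b)
        result[(a, b)] = np.argsort(scores)[::-1][:k]
    return result

# Synthetic stand-in with the same class sizes as the training set
# described above (27 vs 11 samples); values are NOT real expression data.
rng = np.random.default_rng(0)
X = rng.normal(size=(38, 100))
y = np.array([0] * 27 + [1] * 11)
X[y == 1, 7] += 5.0  # make hypothetical gene 7 strongly discriminative

top = pairwise_top_genes(X, y, k=3)
print(top[(0, 1)])
```

With more than two classes, the same function returns one ranked gene list per class pair, which is the sense in which the selection is pair-wise.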


Summary

Introduction

Using DNA microarrays, we have developed two novel models for tumor classification and target gene prediction (BMC Bioinformatics 2003, 4, http://www.biomedcentral.com/1471-2105/4/60). A number of approaches have been applied to this task, such as k-nearest neighbours, weighted voting [9], support vector machines [23], partial least squares [14], hierarchical clustering, artificial neural networks [12], and supervised clustering [5]. Although these approaches show promising results, classification of clinical samples remains a challenging task due to the complexity and high dimensionality of microarray gene expression data [6]. We propose two novel classification models: a combination of optimally selected Self-Organizing Maps (SOMs) followed by Fuzzy C-means clustering (FCC), and the use of pair-wise Fisher's linear discriminant (PFLD). This paper also carries out a systematic study of the internal structure of different tumor classes from microarray expression data.
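As an illustration of the clustering step named above, here is a minimal sketch of the standard Fuzzy C-means algorithm in NumPy. It is not the authors' implementation: the SOM summarization step is omitted, the function name `fuzzy_c_means` and its parameters (fuzzifier `m`, iteration limit, tolerance) are assumptions, and the input is a synthetic two-cluster matrix standing in for sample profiles.

```python
import numpy as np

def fuzzy_c_means(X, c, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Standard Fuzzy C-means on the rows of X.

    Returns (centers, U), where U[i, j] is the membership of sample i
    in cluster j and each row of U sums to 1.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    U = rng.random((n, c))
    U /= U.sum(axis=1, keepdims=True)  # random initial memberships
    for _ in range(n_iter):
        W = U ** m
        # Cluster centers: membership-weighted means of the samples.
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Euclidean distance from every sample to every center.
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        d = np.maximum(d, 1e-12)  # guard against zero distance
        # Membership update: u_ij proportional to d_ij^(-2/(m-1)).
        inv = d ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U

# Synthetic example: two well-separated groups of 20 samples each.
rng = np.random.default_rng(1)
X = np.vstack([
    rng.normal(loc=0.0, scale=0.5, size=(20, 2)),
    rng.normal(loc=5.0, scale=0.5, size=(20, 2)),
])
centers, U = fuzzy_c_means(X, c=2)
labels = U.argmax(axis=1)  # hard assignment from soft memberships
print(labels)
```

Unlike hard clustering, the membership matrix `U` keeps a degree of belonging for every sample and cluster, so borderline samples can be flagged instead of being forced into a single class.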
