Abstract

Background: Many studies based on gene expression data have been reported in great detail; however, a major challenge for methodologists is the choice of classification method. The main purpose of this research was to compare the performance of linear discriminant analysis (LDA) and its modifications for classifying cancer from gene expression data.

Methods: The classification performance of LDA and its modifications was evaluated by applying them to six public cancer gene expression datasets. The methods were LDA, prediction analysis for microarrays (PAM), shrinkage centroid regularized discriminant analysis (SCRDA), shrinkage linear discriminant analysis (SLDA) and shrinkage diagonal discriminant analysis (SDDA). All analyses were performed in R 2.80.

Results: PAM selected fewer feature genes than the other methods in most datasets, the Brain dataset being the exception. Of the two shrinkage discriminant analysis methods, SLDA selected more genes than SDDA in most datasets, the 2-class lung cancer dataset being the exception. SLDA selected more genes than SCRDA in the 2-class lung cancer, SRBCT and Brain datasets; the opposite held for the remaining datasets. The average test error of the modified LDA methods was lower than that of LDA.

Conclusions: The classification performance of the modified LDA methods was superior to that of traditional LDA with respect to average error, and there was no significant difference among the modified methods.

Highlights

  • Studies on the diagnosis of cancer based on gene expression data have been reported in great detail; a major challenge for methodologists is the choice of classification method

  • The classification performance of the modified linear discriminant analysis (LDA) methods was superior to that of traditional LDA with respect to average error, and there was no significant difference among the modified methods


Summary

Introduction

Studies on the diagnosis of cancer based on gene expression data have been reported in great detail; a major challenge for methodologists is the choice of classification method. Proposed solutions have drawn on many innovations, including sophisticated algorithms for support vector machines [1] and ensemble methods such as random forests [2]. The main purpose of this research was to compare the performance of linear discriminant analysis (LDA) and its modifications for classifying cancer from gene expression data.
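The abstract reports how many feature genes each method selected. As a rough illustration of why a shrinkage method can end up with few genes, the sketch below applies the soft-thresholding step commonly associated with nearest shrunken centroids (the idea behind PAM). It is not the authors' code, which was run in R; the function names, the toy data and the threshold values are all hypothetical, and the standardization follows a common textbook form that may differ in detail from any particular software package.

```python
from math import sqrt
from statistics import mean, median


def soft_threshold(d, delta):
    """Move d toward zero by delta; anything within delta of zero becomes 0."""
    if d > delta:
        return d - delta
    if d < -delta:
        return d + delta
    return 0.0


def shrunken_contrasts(X, y, delta):
    """Return {class: [shrunken standardized contrast per gene]}.

    X is a list of samples (rows) by genes (columns); y holds class labels.
    """
    n, n_genes = len(X), len(X[0])
    classes = sorted(set(y))
    rows = {k: [x for x, label in zip(X, y) if label == k] for k in classes}

    overall = [mean(x[j] for x in X) for j in range(n_genes)]
    centroid = {k: [mean(x[j] for x in rows[k]) for j in range(n_genes)]
                for k in classes}

    # Pooled within-class standard deviation of each gene.
    s = []
    for j in range(n_genes):
        ss = sum((x[j] - centroid[k][j]) ** 2 for k in classes for x in rows[k])
        s.append(sqrt(ss / (n - len(classes))))
    s0 = median(s)  # offset that guards against very small denominators

    shrunk = {}
    for k in classes:
        m_k = sqrt(1 / len(rows[k]) - 1 / n)  # assumed textbook scaling
        shrunk[k] = [
            soft_threshold((centroid[k][j] - overall[j]) / (m_k * (s[j] + s0)), delta)
            for j in range(n_genes)
        ]
    return shrunk


def selected_genes(X, y, delta):
    """Indices of genes whose contrast survives shrinkage in at least one class."""
    shrunk = shrunken_contrasts(X, y, delta)
    return [j for j in range(len(X[0])) if any(shrunk[k][j] != 0.0 for k in shrunk)]


# Hypothetical toy data: 6 samples, 3 genes, 2 classes.
# Gene 0 differs strongly between classes, gene 1 not at all, gene 2 weakly.
X = [
    [5.0, 1.0, 2.0],
    [6.0, 2.0, 2.5],
    [5.5, 1.5, 2.2],
    [1.0, 1.0, 2.1],
    [1.5, 2.0, 2.6],
    [0.5, 1.5, 2.4],
]
y = ["A", "A", "A", "B", "B", "B"]

for delta in (0.1, 1.0, 10.0):
    print(delta, selected_genes(X, y, delta))
```

Raising the threshold `delta` removes weakly differential genes first, so the selected set shrinks as the penalty grows. In practice the threshold would be chosen by cross-validation rather than fixed by hand as it is here.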

