Abstract
Feature extraction plays an important role in improving classifier performance. Microarray data consist of a large number of features measured on a small number of samples. In this paper, we address dimensionality reduction for DNA microarray features, extracting the relevant features from among thousands of irrelevant ones. This improves both the speed and the accuracy of the classifiers. Principal component analysis (PCA) is a powerful statistical technique for representing high-dimensional data in a lower-dimensional space without significant loss of information. The aim is to project the original \( I \)-dimensional space onto an \( I_{0} \)-dimensional linear subspace, where \( I > I_{0} \), such that the variance in the data is maximally explained within the smaller \( I_{0} \)-dimensional space. This addresses the curse of dimensionality, which arises when the number of features is large relative to the number of samples. A support vector machine (SVM) is implemented, and its performance is measured in terms of predictive accuracy, specificity, and sensitivity. First, we apply PCA to extract the significant features and then train the SVM on the reduced feature set. In the second part, we attempt to validate our results on two public data sets (ovarian and colon).
Keywords: Cancer classification; Feature extraction; Principal component analysis; Support vector machine
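The three evaluation measures named in the abstract can be computed from the counts of true and false positives and negatives. The following is a minimal sketch, not the authors' implementation: the function name `binary_metrics`, the choice of `1` as the positive (cancer) label, and the example labels are all assumptions made for illustration.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, sensitivity, and specificity for binary labels.

    `positive` marks the class treated as the positive (e.g. tumor) label;
    this labeling convention is an assumption, not taken from the paper.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    tp = tn = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive and pred == positive:
            tp += 1  # positive sample correctly detected
        elif truth == positive:
            fn += 1  # positive sample missed
        elif pred == positive:
            fp += 1  # negative sample flagged as positive
        else:
            tn += 1  # negative sample correctly rejected

    total = tp + tn + fp + fn
    nan = float("nan")
    return {
        # fraction of all samples classified correctly
        "accuracy": (tp + tn) / total if total else nan,
        # true positive rate: TP / (TP + FN)
        "sensitivity": tp / (tp + fn) if (tp + fn) else nan,
        # true negative rate: TN / (TN + FP)
        "specificity": tn / (tn + fp) if (tn + fp) else nan,
    }


# Hypothetical labels for ten test samples (illustrative only, not real data).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Reporting sensitivity and specificity alongside accuracy matters when the classes are imbalanced, since a classifier can reach high accuracy while missing most samples of the minority class.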