Abstract

Classification is a vital tool for understanding the relationships among living things: it groups similar items together, which makes them easier to study. Classification is therefore necessary for identifying the salient features and characteristics of living organisms and the interrelationships among different groups of organisms, just as the correct classification of a person's disease is important for proper treatment. The support vector machine (SVM) was the first proposed kernel-based method; it uses a kernel function to map data from the input space into a high-dimensional feature space, where it searches for a separating hyperplane. SVM is based on simple ideas that originated in statistical learning theory, whose aim is to solve only the problem of interest without solving a more difficult problem as an intermediate step. SVM applies a simple linear method to the data, but in a high-dimensional feature space that is non-linearly related to the input space. Although SVM can be thought of as a linear algorithm in a high-dimensional space, in practice it does not involve any computations in that space. Because high dimensionality is a curse for gene expression data sets, this paper uses Principal Component Analysis (PCA) for feature reduction on breast cancer, lung cancer and cardiotocography data sets. SVM is then trained with linear, polynomial and radial basis function (RBF) kernels on each of these data sets, and the comparison shows that the RBF kernel performs best on all three data sets.
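
The three kernels named above, and the claim that SVM never computes in the high-dimensional space, can be illustrated with a small sketch. This is not the paper's implementation: the function names, the two-dimensional sample points, and the parameter values (`degree`, `coef0`, `gamma`) are all assumptions chosen for illustration. The sketch checks that a degree-2 polynomial kernel evaluated in the input space gives the same value as an ordinary dot product taken after an explicit mapping into a three-dimensional feature space.

```python
import math


def dot(x, z):
    """Ordinary dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, z))


def linear_kernel(x, z):
    """Linear kernel: the dot product in the input space itself."""
    return dot(x, z)


def polynomial_kernel(x, z, degree=2, coef0=0.0):
    """Polynomial kernel (x.z + coef0)^degree; defaults are illustrative."""
    return (dot(x, z) + coef0) ** degree


def rbf_kernel(x, z, gamma=0.5):
    """RBF kernel exp(-gamma * ||x - z||^2); gamma is an assumed value."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)


def phi_degree2(x):
    """Explicit feature map for the degree-2 kernel with coef0 = 0 (2-D input only)."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)


# Hypothetical sample points, not taken from any of the paper's data sets.
x = (1.0, 2.0)
z = (3.0, -1.0)

# Kernel evaluated directly in the 2-D input space.
implicit = polynomial_kernel(x, z, degree=2, coef0=0.0)

# The same quantity computed the long way, in the 3-D feature space.
explicit = dot(phi_degree2(x), phi_degree2(z))

print(implicit, explicit)
print(linear_kernel(x, z), rbf_kernel(x, z))
```

The two printed polynomial values agree up to floating-point rounding, which is the sense in which the kernel replaces computation in the feature space. For the RBF kernel the corresponding feature space is infinite-dimensional, so only the kernel form is practical.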
