Abstract

Microarray technology makes it possible to monitor the expression of thousands of genes simultaneously. However, microarray data is high-dimensional, and this high dimensionality degrades classification performance. To address this issue, this study proposes Minimum Redundancy Maximum Relevance (MRMR) as the dimension reduction method and Support Vector Machine (SVM) as the classifier. Principal Component Analysis (PCA) was also used as a comparison to MRMR. Tests on lung cancer and ovarian cancer data using MRMR and an SVM classifier with either a linear or a polynomial kernel yielded an F1-score of 1 while using only 20% of the original features. This means that the classification accuracy was 100% on these datasets. For colon cancer, MRMR with the polynomial-kernel SVM achieved a higher F1-score than classification without dimension reduction, namely 0.84. Likewise, for leukemia, MRMR with the polynomial-kernel SVM achieved a higher F1-score than classification without dimension reduction, namely 0.9657.
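As a rough illustration of the pipeline described above, the following sketch selects 20% of the features with a greedy MRMR criterion and then trains SVM classifiers with linear and polynomial kernels. It is a minimal sketch under stated assumptions, not the authors' implementation: the data is synthetic (generated with scikit-learn's `make_classification`), the helper `mrmr_select` is hypothetical, and redundancy is approximated by absolute Pearson correlation rather than mutual information between features.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import mutual_info_classif
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC


def mrmr_select(X, y, k, random_state=0):
    """Hypothetical helper: greedy MRMR in its difference form.

    Relevance is the mutual information between each feature and the label.
    Redundancy is approximated by the mean absolute Pearson correlation with
    the already selected features (an assumption made to keep this short).
    """
    relevance = mutual_info_classif(X, y, random_state=random_state)
    corr = np.nan_to_num(np.abs(np.corrcoef(X, rowvar=False)))
    selected = [int(np.argmax(relevance))]  # start with the most relevant feature
    while len(selected) < k:
        redundancy = corr[:, selected].mean(axis=1)
        score = relevance - redundancy
        score[selected] = -np.inf  # never pick the same feature twice
        selected.append(int(np.argmax(score)))
    return selected


# Synthetic stand-in for a microarray dataset: few samples, many features.
X, y = make_classification(
    n_samples=120, n_features=200, n_informative=10, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Keep 20% of the original features, selected on the training split only.
k = X.shape[1] // 5
features = mrmr_select(X_train, y_train, k)

scores = {}
for kernel in ("linear", "poly"):
    clf = SVC(kernel=kernel).fit(X_train[:, features], y_train)
    scores[kernel] = f1_score(y_test, clf.predict(X_test[:, features]))
    print(kernel, round(scores[kernel], 4))
```

Feature selection is fitted on the training split only, so the held-out F1-scores are not inflated by information leaking from the test samples. The scores printed for this synthetic data are unrelated to the results reported for the cancer datasets.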
