Classification and Performance Analysis of Cancer Microarrays Using Relevant Genes

Tasniah Mohiuddin, Paramita Basak Upama, Shamima Naznin

doi:10.1109/iceeict53905.2021.9667822

Abstract

Cancer is one of the deadliest diseases, and the number of cases increases every year. A popular recent approach to cancer identification relies on microarray gene data. This type of data captures gene expression in cells and allows several thousand genes to be analyzed at a time. Analysis of such gene expression helps in cancer identification and classification, which in turn facilitates the selection of proper treatments and the development of drugs. In this research, gene expression data sets for ovarian cancer, leukemia and central nervous system (CNS) cancer are analyzed using several popular machine learning (ML) and data mining techniques: Support Vector Machine (SVM), Naïve Bayes (NB), Decision Tree (DT), Random Forest (RF) and K-Nearest Neighbors (kNN). Before classification, the most relevant set of genes is identified using three feature selection techniques: Genetic Search Algorithm (GA), Evolutionary Algorithm (EA) and Multi-objective Evolutionary Algorithm (MOEA). The ultimate goal of this work is to discover the minimal set of features for a classification model without degrading the classification accuracy. In this work, the combination of MOEA and SVM provides the best outcome, with the maximum accuracy.
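
The two-stage idea described above (search for a small gene subset, then score it with a classifier) can be illustrated with a minimal sketch. It is not the authors' implementation, and everything in it is an assumption made for illustration: the data are synthetic rather than the ovarian, leukemia or CNS sets; a simple genetic search with a size penalty stands in for GA, EA and MOEA (a scalarized objective, not a true multi-objective method); and a leave-one-out 1-nearest-neighbour classifier stands in for SVM so that the example needs only the Python standard library. All function names and parameter values are hypothetical.

```python
import random


def make_data(n_samples=40, n_genes=30, n_informative=3, seed=0):
    """Synthetic stand-in for a microarray (assumption): only the first
    n_informative genes carry any class signal."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        row = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
        for g in range(n_informative):
            row[g] += 3.0 * label  # shift the informative genes for class 1
        X.append(row)
        y.append(label)
    return X, y


def loo_accuracy(X, y, mask):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier that
    only looks at the genes switched on in mask."""
    genes = [g for g, keep in enumerate(mask) if keep]
    if not genes:
        return 0.0  # an empty gene subset cannot classify anything
    correct = 0
    for i in range(len(X)):
        best_j, best_d = None, float("inf")
        for j in range(len(X)):
            if i == j:
                continue
            d = sum((X[i][g] - X[j][g]) ** 2 for g in genes)
            if d < best_d:
                best_j, best_d = j, d
        correct += y[best_j] == y[i]
    return correct / len(X)


def fitness(X, y, mask, penalty=0.01):
    """Accuracy minus a small cost per selected gene (assumed weighting)."""
    return loo_accuracy(X, y, mask) - penalty * sum(mask)


def genetic_search(X, y, pop_size=20, generations=15,
                   mutation_rate=0.05, seed=1):
    """Evolve boolean gene masks: keep the better half of the population,
    refill it with one-point crossover plus bit-flip mutation."""
    rng = random.Random(seed)
    n_genes = len(X[0])
    population = [[rng.random() < 0.2 for _ in range(n_genes)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=lambda m: fitness(X, y, m),
                        reverse=True)
        elite = ranked[: pop_size // 2]
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, n_genes)
            child = a[:cut] + b[cut:]
            child = [(not bit) if rng.random() < mutation_rate else bit
                     for bit in child]
            children.append(child)
        population = elite + children
    return max(population, key=lambda m: fitness(X, y, m))


if __name__ == "__main__":
    X, y = make_data()
    best = genetic_search(X, y)
    chosen = [g for g, keep in enumerate(best) if keep]
    print("selected genes:", chosen)
    print("leave-one-out accuracy:", loo_accuracy(X, y, best))
```

The penalty term is one simple way to trade accuracy against subset size in a single score. A multi-objective method such as MOEA would instead keep the two objectives separate and return a set of non-dominated gene subsets.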

Full Text