Abstract

In recent medical research, assessing gene expression levels with microarray technology has become essential. Many diseases, such as breast cancer, lung cancer, and more recently COVID-19, are studied using gene expression data. This paper performs both feature selection and classification on several microarray datasets. Gene expression data is high dimensional, and extracting an optimal subset of genes from microarray data is a challenging task. Four feature selection methods, Recursive Feature Elimination (RFE), Relief, LASSO (Least Absolute Shrinkage and Selection Operator), and Ridge, were first applied to extract optimal genes from the microarray data. Several multiclass classification methods were then applied: K-Nearest Neighbors (KNN), Support Vector Machines (SVM), Multilayer Perceptron Networks (MLP), Random Forest (RF), and Logistic Regression (LR). Combining these feature selection and classification methods is computationally expensive. However, applying a resampling method, SMOTE (Synthetic Minority Oversampling Technique), prior to feature selection improves classification on the microarray data. Resampling combined with RFE and LASSO feature selection and SVM and LR classification outperforms the other methods.

Keywords: Microarray data, Gene expression, K-Nearest Neighbors, Random Forest, Recursive Feature Elimination, Support vector machines, LASSO, Ridge, Multilayer Perceptron Networks, Logistic Regression
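
To illustrate the resampling step named in the abstract, the following is a minimal sketch of the core SMOTE idea: each synthetic minority-class sample is an interpolation between a minority sample and one of its nearest minority-class neighbors. It is not the authors' implementation. The function name `smote`, its parameters, and the toy expression values are assumptions made for illustration, and it uses only the Python standard library rather than any particular machine learning package.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic samples interpolated between minority-class neighbors.

    Hypothetical helper for illustration; `minority` is a list of
    equal-length feature vectors (e.g. expression values per gene).
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbors than other samples
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest minority samples to `base`, excluding itself.
        neighbors = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        neighbor = minority[rng.choice(neighbors)]
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbor)])
    return synthetic


# Toy minority class: three samples with three (made-up) gene expression values.
minority = [[1.0, 2.0, 0.5], [1.2, 1.8, 0.7], [0.9, 2.2, 0.4]]
new_samples = smote(minority, n_new=4, k=2)
```

In a workflow like the one described, the synthetic samples would be appended to the training set before feature selection. Oversampling only the training split, and not the held-out test data, is the usual precaution against optimistic accuracy estimates.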
