Abstract

Cancer classification is a crucial domain in bioinformatics research, and microarray technology is commonly used to identify specific diseases. A small number of genes identified for clinical applications can lead to low-cost medicines that help estimate a patient's survival time or diagnose cancer. Because microarray data contain many more genes than samples, high dimensionality is a serious concern. In this study, the genes in the microarray data are scored using the F-statistic, the t-statistic, and the signal-to-noise ratio (SNR). The top-m ranked genes are then analyzed with optimization approaches to extract useful information. The methods employed include the genetic algorithm (GA), particle swarm optimization (PSO), cuckoo search (CS), and shuffled frog leaping with Lévy flight (SFLLF). Classification is performed with the support vector machine (SVM), the k-nearest neighbor (KNN) classifier, and the naive Bayes classifier (NBC). The datasets used for experimental analysis include Lung Cancer Michigan, AMLALL, Colon Tumour, and Lung Harvard2, among others. The classifiers are assessed using 5-fold cross-validation. The findings show that the proposed two-step feature selection approaches are effective in selecting relevant genes from microarray data for cancer classification.
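
The first, filter-based step described above (scoring each gene and keeping the top-m) can be illustrated with a minimal sketch. It assumes a two-class problem and the commonly used SNR definition |mean_1 - mean_2| / (sd_1 + sd_2); the abstract does not state the exact formula, so this definition is an assumption. The function names `snr_scores` and `top_m_genes` and the toy expression values are hypothetical, chosen only for illustration.

```python
from statistics import mean, stdev


def snr_scores(samples, labels):
    """Signal-to-noise ratio per gene for a two-class problem.

    samples: list of rows (one per sample), each a list of expression values.
    labels:  list of 0/1 class labels, one per sample.
    Assumed definition: |mean_0 - mean_1| / (sd_0 + sd_1).
    """
    n_genes = len(samples[0])
    scores = []
    for g in range(n_genes):
        a = [row[g] for row, y in zip(samples, labels) if y == 0]
        b = [row[g] for row, y in zip(samples, labels) if y == 1]
        denom = stdev(a) + stdev(b)
        # Guard against genes that are constant within both classes.
        scores.append(abs(mean(a) - mean(b)) / denom if denom > 0 else 0.0)
    return scores


def top_m_genes(samples, labels, m):
    """Return the indices of the m highest-scoring genes, best first."""
    scores = snr_scores(samples, labels)
    order = sorted(range(len(scores)), key=lambda g: scores[g], reverse=True)
    return order[:m]


# Toy data (made up): 6 samples x 3 genes; only gene 0 separates the classes.
toy_X = [
    [1.0, 5.0, 0.2],
    [1.2, 5.1, 0.9],
    [0.9, 4.9, 0.4],
    [3.0, 5.0, 0.7],
    [3.1, 5.2, 0.3],
    [2.9, 4.8, 0.8],
]
toy_y = [0, 0, 0, 1, 1, 1]

selected = top_m_genes(toy_X, toy_y, m=1)
```

In the two-step scheme, the gene indices returned here would form the reduced search space handed to the optimization step (GA, PSO, CS, or SFLLF), which is not shown in this sketch.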
