Abstract

Alzheimer’s disease (AD) is a neurological disorder that destroys memory and other important mental functions. One of the most accurate ways to identify disease-causing genes is to monitor gene expression values across samples, and selecting the significant genes for classification is an important step in gene expression studies. In this study, the experimental data are gene expression measurements from the human brain in persons with AD and in older control subjects, taken from the GEO data set GSE5281. A new two-step gene selection method, based on statistical methods and a heuristic optimization approach, is applied to filter out noisy and redundant genes. In the first step, the t-statistic (t-test), signal-to-noise ratio (SNR) and F-test are used to score the genes. In the second step, the top ten significant genes selected by the statistical methods are passed to Particle Swarm Optimization (PSO) to obtain the optimal number of features for Alzheimer’s disease. To avoid the stagnation issue in PSO, a modified PSO approach is proposed that finds a new particle position using the Genetic Algorithm (GA) crossover and mutation operators. Five classifiers (decision tree, Support Vector Machine (SVM), linear model, random forest and neural network) are trained and tested to analyse the performance of the GA and PSO approaches. Modified PSO with the t-test, used with the random forest and linear model classifiers, provides 100% accuracy on the GSE5281 test data set with the optimum number of genes. The significant genes identified through this research are EGR1, CKMT1B, RPL15, PSMB3, GRK4, COX6A1 and PHIP from the GSE5281 data set.
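
The two-step selection described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give formulas or parameters, so the Welch form of the t-statistic, the SNR definition (difference of class means over the sum of class standard deviations), the binary gene mask, the uniform crossover with the best mask, the mutation rate, and all function names and toy values are assumptions made for illustration only.

```python
import random
from math import sqrt
from statistics import mean, stdev


def t_stat(a, b):
    """Welch t-statistic for one gene between two sample groups (assumed variant)."""
    return (mean(a) - mean(b)) / sqrt(stdev(a) ** 2 / len(a) + stdev(b) ** 2 / len(b))


def snr(a, b):
    """Signal-to-noise ratio: difference of means over sum of standard deviations."""
    return (mean(a) - mean(b)) / (stdev(a) + stdev(b))


def top_genes(expr, labels, score, k=10):
    """Step 1: rank genes by absolute score and return the names of the top k.

    expr   -- dict mapping gene name to a list of expression values, one per sample
    labels -- list of 0/1 class labels aligned with the samples (1 = case, 0 = control)
    """
    ranked = []
    for gene, values in expr.items():
        case = [v for v, y in zip(values, labels) if y == 1]
        ctrl = [v for v, y in zip(values, labels) if y == 0]
        ranked.append((abs(score(case, ctrl)), gene))
    ranked.sort(reverse=True)
    return [gene for _, gene in ranked[:k]]


def ga_refresh(position, best, rng, mutation_rate=0.1):
    """Step 2 helper: build a new binary gene mask for a stagnating particle.

    Uniform crossover with the best-known mask, then bit-flip mutation.
    Both operator choices are assumptions, not taken from the paper.
    """
    child = [b if rng.random() < 0.5 else p for p, b in zip(position, best)]
    return [1 - bit if rng.random() < mutation_rate else bit for bit in child]


# Toy data (hypothetical values, not from GSE5281): three genes, six samples.
expr = {
    "gene_a": [5.1, 5.3, 4.9, 1.0, 1.2, 0.9],
    "gene_b": [2.0, 2.1, 1.9, 2.0, 2.2, 1.8],
    "gene_c": [3.0, 3.4, 2.6, 2.0, 2.5, 1.5],
}
labels = [1, 1, 1, 0, 0, 0]

selected = top_genes(expr, labels, t_stat, k=2)
new_mask = ga_refresh([1, 0, 1, 0], [1, 1, 0, 0], random.Random(0))
```

In a full pipeline, each candidate mask would be scored by the accuracy of a classifier trained on the selected genes, and `ga_refresh` would be called only for particles whose fitness has stopped improving; that fitness loop is omitted here because the abstract does not specify it.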
