Abstract

An important step in microarray data analysis is selecting an appropriately small number of the most relevant genes from a large population of genes. In this study, we propose a composite gene selection method that combines unsupervised and supervised gene selection. In the unsupervised stage, we use the threshold number of misclassification (TNoM) score to select the top-ranked genes for microarray data analysis. In the supervised stage, the minimum number of genes giving the highest accuracy is obtained from the top-ranked genes using the non-overlap area distribution measurement (NADM) method provided by the neural network with weighted fuzzy membership functions (NEWFM). Applying this method to a colon cancer dataset of 2000 genes and a leukemia dataset of 7129 genes, we selected the top-ranked 93 colon cancer genes (TNoM score ≤14) and 143 leukemia genes (TNoM score ≤13). From these top-ranked genes, the NADM method selected a minimum of 4 colon cancer genes and 13 leukemia genes. When these 4 colon cancer genes and 13 leukemia genes were used as inputs for the NEWFM, the classification accuracies were 98.39% for colon cancer and 100% for leukemia.
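
The TNoM filtering step can be illustrated with a minimal sketch. It assumes the usual definition of the TNoM score: for a single gene, the smallest number of samples misclassified by any rule of the form "expression above a threshold implies one class", taken over all thresholds and both directions. The function names `tnom_score` and `select_top_genes`, the 0/1 label encoding, and the genes-by-samples layout are assumptions made for illustration and are not taken from the paper. The NADM and NEWFM stages are not sketched here.

```python
from typing import List, Sequence


def tnom_score(expression: Sequence[float], labels: Sequence[int]) -> int:
    """Smallest number of errors made by any single-threshold rule on one gene.

    `labels` holds 0 or 1 for each sample (an assumed encoding).
    """
    if len(expression) != len(labels):
        raise ValueError("expression and labels must have the same length")

    pairs = sorted(zip(expression, labels))
    n = len(pairs)
    n_neg = sum(1 for _, y in pairs if y == 0)

    # errors_up counts the errors of the rule "value > threshold => class 1".
    # With the threshold below every sample, all samples are predicted as
    # class 1, so every class-0 sample is an error.
    errors_up = n_neg
    # The opposite rule ("value > threshold => class 0") errs on exactly
    # the samples the first rule gets right, hence n - errors_up.
    best = min(errors_up, n - errors_up)

    i = 0
    while i < n:
        # Move the threshold past all samples sharing the same value.
        j = i
        while j < n and pairs[j][0] == pairs[i][0]:
            # This sample is now below the threshold and predicted class 0.
            errors_up += 1 if pairs[j][1] == 1 else -1
            j += 1
        best = min(best, errors_up, n - errors_up)
        i = j
    return best


def select_top_genes(
    genes: Sequence[Sequence[float]], labels: Sequence[int], max_score: int
) -> List[int]:
    """Return indices of genes (rows) whose TNoM score is at most max_score."""
    return [
        idx for idx, row in enumerate(genes)
        if tnom_score(row, labels) <= max_score
    ]


if __name__ == "__main__":
    toy_labels = [0, 0, 1, 1]
    toy_genes = [
        [1.0, 2.0, 3.0, 4.0],  # separates the classes perfectly
        [1.0, 3.0, 2.0, 4.0],  # one sample on the wrong side of any threshold
        [4.0, 3.0, 2.0, 1.0],  # separates perfectly in the other direction
    ]
    print([tnom_score(g, toy_labels) for g in toy_genes])
    print(select_top_genes(toy_genes, toy_labels, max_score=0))
```

Under these assumptions, calling `select_top_genes` with `max_score=14` on the colon cancer expression matrix, or `max_score=13` on the leukemia matrix, would correspond to the cutoffs quoted in the abstract. A lower score means the gene alone separates the two classes more cleanly.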
