Abstract
DNA microarray datasets contain a very large number of genes, yet only a small number of those genes are needed to distinguish a specific kind of disease. Microarray data raise two main issues, the curse of dimensionality and the presence of many irrelevant features, both of which can be mitigated by feature extraction and dimensionality reduction techniques. Gene selection therefore plays a vital part in removing irrelevant features, which improves accuracy. Accurate prediction of disease is key to assessing patients for prognosis and treatment. One key method for gene expression analysis is clustering: cluster analysis is well suited to examining the expression levels of multiple genes simultaneously in microarray data. In this thesis I discuss feature extraction procedures including I-RELIEF and principal component analysis (PCA). Because microarray datasets are high-dimensional, PCA can be used very effectively to reduce their dimensionality. After feature selection, clustering is performed: several clustering algorithms, namely k-means, hierarchical clustering, k-medoids and DBSCAN, are applied to the datasets GSE2226 and GSE18229, and the results of the different algorithms are discussed.
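The reduce-then-cluster pipeline described above can be illustrated with a minimal sketch. It is not the code used in the thesis: the expression values are invented toy numbers (not GSE2226 or GSE18229), only the first principal component is kept (found here by power iteration), and plain k-means stands in for the four algorithms compared. All function names (`center`, `first_principal_component`, `kmeans_1d`, `cluster_samples`) are hypothetical.

```python
def center(rows):
    """Subtract the per-gene mean from every sample (rows = samples, columns = genes)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]


def first_principal_component(centered, iters=100):
    """Approximate the leading eigenvector of X^T X by power iteration."""
    n, d = len(centered), len(centered[0])
    v = [1.0 / d ** 0.5] * d  # fixed starting vector keeps the sketch deterministic
    for _ in range(iters):
        scores = [sum(r[j] * v[j] for j in range(d)) for r in centered]          # X v
        w = [sum(scores[i] * centered[i][j] for i in range(n)) for j in range(d)]  # X^T (X v)
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return v


def kmeans_1d(values, k, iters=50):
    """Plain k-means on scalar values, initialised from evenly spaced order statistics."""
    srt = sorted(values)
    centers = [srt[(2 * i + 1) * len(srt) // (2 * k)] for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        # Assignment step: nearest centre for each value.
        labels = [min(range(k), key=lambda c: abs(x - centers[c])) for x in values]
        # Update step: each centre moves to the mean of its members.
        new_centers = []
        for c in range(k):
            members = [x for x, lab in zip(values, labels) if lab == c]
            new_centers.append(sum(members) / len(members) if members else centers[c])
        if new_centers == centers:
            break
        centers = new_centers
    return labels


def cluster_samples(rows, k):
    """Project samples onto the first principal component, then cluster the scores."""
    centered = center(rows)
    pc1 = first_principal_component(centered)
    scores = [sum(r[j] * pc1[j] for j in range(len(pc1))) for r in centered]
    return kmeans_1d(scores, k)


# Toy expression matrix (assumed values): 6 samples x 5 genes.
# Only the first two genes differ between the two groups of samples.
samples = [
    [9.0, 8.5, 1.0, 1.2, 0.9],
    [8.7, 9.1, 1.1, 0.8, 1.0],
    [9.2, 8.8, 0.9, 1.0, 1.1],
    [2.0, 1.5, 1.0, 1.1, 1.0],
    [1.8, 2.2, 1.2, 0.9, 0.9],
    [2.1, 1.9, 0.8, 1.0, 1.1],
]

labels = cluster_samples(samples, k=2)
print(labels)
```

In this toy matrix the two informative genes dominate the first principal component, so a single component is enough to separate the groups. Real microarray data would normally need several components and a library implementation of PCA and of each clustering algorithm.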