Abstract

Recently, feature extraction and dimensionality reduction have become fundamental tools for many data mining tasks, especially for processing high-dimensional data such as genome data. In this paper, a new feature extraction method based on sparse singular value decomposition (SSVD) is developed. The SSVD algorithm is applied to extract differentially expressed genes from two genome datasets, both from The Cancer Genome Atlas (TCGA), and the extracted genes are then evaluated with tools based on Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis. As a gene extraction method, SSVD is also compared with existing feature extraction methods such as independent component analysis, p-norm robust feature extraction, and sparse principal component analysis. The GO analysis results show that the SSVD method outperforms the competing algorithms. The KEGG analysis results show which of the extracted genes participate in cancer pathways. The experiments show that SSVD is an effective feature selection method compared with the competing methods. The KEGG analysis results may also provide a meaningful reference for further study by professionals in biomedical science.
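
To make the idea of SSVD-based gene extraction concrete, the following is a minimal sketch, not the authors' algorithm, whose details the abstract does not give. It assumes a rank-1 sparse SVD computed by alternating updates with soft-thresholding on the gene loading vector, and it treats genes with nonzero loadings as the extracted genes. The function names (`soft_threshold`, `sparse_rank1_svd`, `select_genes`), the penalty `lam`, and the samples-by-genes layout of `X` are all illustrative assumptions.

```python
import numpy as np

def soft_threshold(x, lam):
    """Shrink each entry toward zero by lam; entries within [-lam, lam] become 0."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def sparse_rank1_svd(X, lam, n_iter=100, tol=1e-6):
    """Rank-1 sparse SVD sketch (assumed variant, not the paper's exact method).

    X   : array of shape (n_samples, n_genes)
    lam : hypothetical sparsity penalty applied to the gene loadings
    Returns (u, s, v) where v is a sparse unit-norm gene loading vector.
    """
    # Start from the leading singular vectors of the ordinary SVD.
    U, _, Vt = np.linalg.svd(X, full_matrices=False)
    u, v = U[:, 0], Vt[0]
    for _ in range(n_iter):
        # Update gene loadings, then shrink small ones exactly to zero.
        v_new = soft_threshold(X.T @ u, lam)
        norm_v = np.linalg.norm(v_new)
        if norm_v == 0.0:
            # Penalty too large: every gene was thresholded away.
            return u, 0.0, v_new
        v_new = v_new / norm_v
        # Update sample scores given the sparse gene loadings.
        u_new = X @ v_new
        u_new = u_new / np.linalg.norm(u_new)
        converged = np.linalg.norm(v_new - v) < tol
        u, v = u_new, v_new
        if converged:
            break
    s = float(u @ X @ v)
    return u, s, v

def select_genes(v):
    """Indices of genes with nonzero loading, i.e. the extracted genes."""
    return np.flatnonzero(v)
```

In this sketch a larger `lam` yields fewer extracted genes; how the penalty would be chosen for real TCGA data is outside what the abstract states.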

