Abstract

Cancer subtype identification through integrative analysis of high-dimensional, heterogeneous multi-omics data has gained considerable attention. Clustering with data integration is an attractive way to uncover hidden substructure in the datasets that reflects the correlations between and within the data types. In this paper, joint non-negative matrix factorization (jNMF) and sparse-jNMF are adopted for integrative clustering of multi-omics data. NMF is solved iteratively, and its objective is inherently non-convex, non-differentiable and multimodal; the initial estimate of the NMF factor matrices therefore strongly affects the quality of the solution. Metaheuristic-optimized initialization of NMF is considered a favorable choice. In this paper, initialization of jNMF and sparse-jNMF based on a high-dimensional GTO encoded structure (HD-GTO) is proposed. Experiments are conducted on two multi-omics cancer datasets. HD-GTO-guided initialization of sparse-jNMF improves accuracy and purity compared with other state-of-the-art metaheuristics. The experimental results also confirm that HD-GTO sparse-jNMF yields a 3.5% average improvement in accuracy and a 4.1% average improvement in purity over jNMF on the two datasets.

Keywords: Multi-omics data; Data integration; Metaheuristics; Non-negative matrix factorization; Cancer subtypes
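
To make the role of initialization concrete, the following is a minimal sketch of joint NMF, not the authors' implementation. The abstract does not state the objective or the update rules, so the sketch assumes a common formulation: each omics view X_i (samples x features) is approximated as W H_i with one shared sample factor W, fitted with standard multiplicative updates for the squared Frobenius error. The names joint_nmf, objective, cluster_labels, purity, W0 and H0 are hypothetical. HD-GTO itself is not implemented; W0 and H0 simply mark where a metaheuristic-supplied starting point would enter, with a random start as the fallback.

```python
import numpy as np

def joint_nmf(views, k, W0=None, H0=None, n_iter=200, eps=1e-9, seed=0):
    """Approximate every view X_i (n x p_i) as W @ H_i with a shared W (n x k).

    Assumed formulation (not taken from the paper): minimise
    sum_i ||X_i - W H_i||_F^2 with multiplicative updates.
    W0 / H0 are optional starting points, e.g. from a metaheuristic.
    """
    rng = np.random.default_rng(seed)
    n = views[0].shape[0]
    # Fall back to a random non-negative start when no initial point is given.
    W = rng.random((n, k)) if W0 is None else np.array(W0, dtype=float)
    if H0 is None:
        Hs = [rng.random((k, X.shape[1])) for X in views]
    else:
        Hs = [np.array(H, dtype=float) for H in H0]

    for _ in range(n_iter):
        # Shared factor: pool the contributions of all views.
        num = sum(X @ H.T for X, H in zip(views, Hs))
        den = W @ sum(H @ H.T for H in Hs) + eps
        W *= num / den
        # View-specific factors: one update per view with the current W.
        for i, X in enumerate(views):
            Hs[i] *= (W.T @ X) / (W.T @ W @ Hs[i] + eps)
    return W, Hs

def objective(views, W, Hs):
    """Total squared reconstruction error over all views."""
    return sum(np.linalg.norm(X - W @ H) ** 2 for X, H in zip(views, Hs))

def cluster_labels(W):
    """Assign each sample to the component with the largest weight in W."""
    return np.argmax(W, axis=1)

def purity(labels_true, labels_pred):
    """Fraction of samples that fall in the majority true class of their cluster."""
    labels_true = np.asarray(labels_true)
    labels_pred = np.asarray(labels_pred)
    total = 0
    for c in np.unique(labels_pred):
        _, counts = np.unique(labels_true[labels_pred == c], return_counts=True)
        total += counts.max()
    return total / labels_true.size
```

Because the objective is non-convex, running joint_nmf from different starting points (different seed values, or different W0 and H0) can end at different local minima with different cluster assignments and purity, which is the sensitivity that an optimized initialization aims to reduce. A sparse variant would add a sparsity penalty to the objective; its exact form is not given in the abstract and is omitted here.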
