Tensor-Decomposition-Based Unsupervised Feature Extraction Applied to Prostate Cancer Multiomics Data

Y-H Taguchi, Turki Turki

doi:10.3390/genes11121493

Open Access

https://doi.org/10.3390/genes11121493

Journal: Genes	Publication Date: Dec 11, 2020
Citations: 3	License type: CC BY 4.0

Affiliation: Chuo University, King Abdulaziz University

Abstract

The large p small n problem remains a challenge for which no de facto standard method exists. In this study, we propose a tensor-decomposition (TD)-based unsupervised feature extraction (FE) formalism applied to multiomics datasets in which the number of features exceeds 100,000 whereas the number of samples is only about 100, hence constituting a typical large p small n problem. When applied to synthetic datasets, the proposed TD-based unsupervised FE outperformed three conventional supervised feature selection methods (random forest, categorical regression (also known as analysis of variance, or ANOVA), and penalized linear discriminant analysis) and two unsupervised methods (multiple non-negative matrix factorization and principal component analysis (PCA)-based unsupervised FE). When applied to multiomics datasets, it outperformed the four methods other than PCA-based unsupervised FE. The genes selected by TD-based unsupervised FE were enriched in genes known to be related to the tissues and transcription factors measured. TD-based unsupervised FE was thus demonstrated to be not only the superior feature selection method but also a method that can select biologically reliable genes. To our knowledge, this is the first study in which TD-based unsupervised FE has been successfully applied to the integration of this variety of multiomics measurements.

Highlights

The term “big data” often indicates a high number of instances as well as features [1,2]. Typical big data comprise a few million images, each composed of several million pixels. The number of features is as high as the number of instances, or often higher.
Taguchi has proposed a method, very different from the typical machine-learning methods, that is applicable to large p small n problems: tensor-decomposition (TD)-based unsupervised feature extraction (FE) [17]. An m-mode tensor is associated with more than two suffixes, whereas a matrix is associated with two suffixes, row and column.
As many as 1785 protein-coding genes can be counted in these regions, which is much higher than expected. This indicates that TD-based unsupervised FE can select genomic regions that include protein-coding genes, correctly considering the altered multiomics variables between normal and tumor tissues. Non-coding RNAs have a key role in regulating the behavior of cells, and their over- and underexpression is strongly correlated with cancer.

Summary

Introduction

The term “big data” often indicates a high number of instances as well as features [1,2]. Taguchi has proposed a method, very different from the typical machine-learning methods, that is applicable to large p small n problems: tensor-decomposition (TD)-based unsupervised feature extraction (FE) [17]. An m-mode tensor is associated with more than two suffixes, whereas a matrix is associated with two suffixes, row and column. In this method, a smaller number of representative features, referred to as singular value vectors, are generated as linear combinations of the original large number of features, without considering labeling. For “multiple tissues”, since a tensor is a more reasonable format than a matrix, TD-based unsupervised FE is a more suitable method than PCA-based unsupervised FE.
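The idea of generating singular value vectors without labels and then selecting features from them can be illustrated with a minimal Python sketch. It is not the authors' code: the synthetic tensor (features x samples x tissues), the planted signal, the helper names (unfold, hosvd, benjamini_hochberg), the Gaussian null used for P-values, and the 0.01 threshold are all assumptions made for illustration. Class labels are used only after decomposition, to decide which sample-mode vector to interpret.

```python
import math
import numpy as np

def unfold(x, mode):
    """Matricize tensor x so that `mode` indexes the rows."""
    return np.moveaxis(x, mode, 0).reshape(x.shape[mode], -1)

def hosvd(x):
    """Higher-order SVD: return the core tensor and one factor matrix per mode."""
    factors = [np.linalg.svd(unfold(x, m), full_matrices=False)[0]
               for m in range(x.ndim)]
    core = x
    for m, u in enumerate(factors):
        # Project mode m onto its singular value vectors.
        core = np.moveaxis(np.tensordot(u.T, core, axes=(1, m)), 0, m)
    return core, factors

def benjamini_hochberg(p):
    """Return Benjamini-Hochberg adjusted P-values in the original order."""
    n = len(p)
    order = np.argsort(p)
    scaled = p[order] * n / np.arange(1, n + 1)
    # Enforce monotonicity from the largest P-value downwards.
    adjusted = np.minimum.accumulate(scaled[::-1])[::-1]
    out = np.empty(n)
    out[order] = np.minimum(adjusted, 1.0)
    return out

# Hypothetical synthetic data: 1000 features, 20 samples, 3 tissues.
rng = np.random.default_rng(0)
n_features, n_samples, n_tissues, n_planted = 1000, 20, 3, 50
labels = np.array([0] * 10 + [1] * 10)           # 0 = normal, 1 = tumor
x = rng.normal(size=(n_features, n_samples, n_tissues))
x[:n_planted, labels == 1, :] += 4.0             # planted class difference
x -= x.mean(axis=1, keepdims=True)               # center each feature over samples

# The decomposition itself never sees the labels.
core, (u_feat, u_samp, u_tiss) = hosvd(x)

# Post hoc interpretation: sample vector most distinct between classes,
# tissue vector closest to constant (tissue-independent).
contrast = u_samp[labels == 1].mean(axis=0) - u_samp[labels == 0].mean(axis=0)
l2 = int(np.argmax(np.abs(contrast)))
l3 = int(np.argmin(u_tiss.std(axis=0)))

# Feature vector sharing the largest core weight with the chosen pair.
l1 = int(np.argmax(np.abs(core[:, l2, l3])))
u = u_feat[:, l1]

# Two-sided P-values assuming the components follow a Gaussian null,
# then multiple-testing correction.
z = u / u.std()
p = np.array([math.erfc(abs(v) / math.sqrt(2)) for v in z])
selected = np.flatnonzero(benjamini_hochberg(p) < 0.01)
print(len(selected), "features selected")
```

In this sketch the standard deviation is estimated from all components, including the planted ones, which makes the test conservative. With real multiomics data each omics layer would need its own normalization before being stacked into a tensor.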

Synthetic Data

Real Dataset

Categorical Regression

PenalizedLDA

TD-Based Unsupervised FE

PCA-Based Unsupervised FE

Real Data

Discussions Not Specific to Either Synthetic or Real Data

Conclusions


