Can we infer tumor presence of single cell transcriptomes and their tumor of origin from bulk transcriptomes by machine learning?

Hua-Ping Liu, Dongwen Wang, Hung-Ming Lai

doi:10.1016/j.csbj.2022.05.035

Abstract

There is a growing need for models that use single-cell RNA-seq (scRNA-seq) to separate malignant from nonmalignant cells and to identify the tumor of origin of single cells and/or circulating tumor cells (CTCs). At present, it is infeasible to train a tumor-of-origin model on scRNA-seq data by machine learning (ML). We therefore asked whether an ML model trained on bulk transcriptomes can be applied to scRNA-seq data to infer tumor presence in single cells and to indicate their tumor of origin. We used k-nearest neighbors, one-versus-all support vector machines, one-versus-one support vector machines and random forests, and introduced scTumorTrace, in a pioneering experiment covering leukocytes and seven major cancer types for which both bulk RNA-seq and scRNA-seq data were available. The 13 ML models trained on bulk RNA-seq were all reliable on a validation set of bulk transcriptomes (F-score > 96%), but none of them except scTumorTrace was applicable to scRNA-seq. Inference from bulk RNA-seq to scRNA-seq was impaired by feature selection and improved by log2-transformed TPM units. scTumorTrace with transcriptome-wide 2-tuples achieved F-scores above 98.74% and 94.29% in inferring tumor presence and tumor of origin, respectively, at single-cell resolution, and correctly identified as leukocytes 45 single cells that were candidate prostate CTCs but lineage-confirmed non-CTCs. We conclude that modern ML techniques are quantitative and can hardly address the questions raised. scTumorTrace with transcriptome-wide 2-tuples is qualitative, standardization-free and unaffected by log2 transformation, which enables us to infer the tumor presence of single-cell transcriptomes and their tumor of origin from bulk transcriptomes.
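
The abstract does not define "2-tuples" or describe how scTumorTrace works internally. The sketch below is therefore only an illustration of the general idea of a qualitative, standardization-free feature, under the assumption that a 2-tuple is a pair of genes compared by relative order within one sample. It is not the authors' implementation: the gene names, expression values, and the helpers `pair_signs`, `class_profile` and `predict` are all hypothetical. It shows that such order-based features are identical before and after a log2 transform, so a profile built from bulk samples can be compared directly with a single cell.

```python
import math
from itertools import combinations


def pair_signs(expr, genes):
    """Encode one sample as 0/1 flags: 1 if the first gene of a pair is higher."""
    return [int(expr[a] > expr[b]) for a, b in combinations(genes, 2)]


def class_profile(samples, genes):
    """Majority direction of each gene pair across one class's bulk samples."""
    encoded = [pair_signs(s, genes) for s in samples]
    return [int(2 * sum(col) > len(col)) for col in zip(*encoded)]


def predict(expr, profiles, genes):
    """Pick the class whose pair directions agree most with the sample."""
    signs = pair_signs(expr, genes)

    def agreement(profile):
        return sum(s == p for s, p in zip(signs, profile))

    return max(profiles, key=lambda label: agreement(profiles[label]))


# Hypothetical genes and made-up TPM values (assumptions, not data from the paper).
GENES = ["G1", "G2", "G3"]
bulk = {
    "tumor": [
        {"G1": 90.0, "G2": 10.0, "G3": 30.0},
        {"G1": 70.0, "G2": 5.0, "G3": 20.0},
    ],
    "leukocyte": [
        {"G1": 2.0, "G2": 40.0, "G3": 15.0},
        {"G1": 1.0, "G2": 55.0, "G3": 25.0},
    ],
}
profiles = {label: class_profile(samples, GENES) for label, samples in bulk.items()}

# A made-up single cell on a much lower scale than the bulk samples.
cell = {"G1": 8.0, "G2": 0.5, "G3": 3.0}
cell_log2 = {g: math.log2(v + 1.0) for g, v in cell.items()}

# The order of two genes within a sample survives any strictly increasing
# transform, so raw and log2-transformed values give the same features.
label_raw = predict(cell, profiles, GENES)
label_log2 = predict(cell_log2, profiles, GENES)
print(label_raw, label_log2)
```

Because each feature depends only on which of two genes is higher within the same sample, no cross-sample normalization is needed in this toy setting. Ties, which are common in scRNA-seq because of zero counts, are encoded as 0 here; a real method would need an explicit rule for them.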

Journal: Computational and Structural Biotechnology Journal
Publication Date: Jan 1, 2022
Citations: 4
License type: cc-by-nc-nd
