Abstract

Publicly available gene expression datasets were analyzed to develop a chromophobe and oncocytoma related gene signature (COGS) that distinguishes chromophobe renal cell carcinoma (chRCC) from renal oncocytoma (RO). The datasets GSE11151, GSE19982, GSE2109, GSE8271 and GSE11024 were combined into a discovery dataset. Transcriptomic differences between the two tumor types were identified in the discovery dataset by unsupervised learning with density-based UMAP (DBU), which separated them with 97.8% accuracy. The top 30 genes were identified by univariate gene expression analysis and ROC analysis and combined into a gene signature called COGS. COGS, combined with DBU, differentiated chRCC from RO in the discovery dataset with 97.8% accuracy. The classification accuracy of COGS was validated in an independent meta-dataset consisting of TCGA-KICH and GSE12090, in which COGS differentiated chRCC from RO with 100% accuracy. The differentially expressed genes were involved in carbohydrate metabolism, transcriptional regulation by TP53, beta-catenin-dependent Wnt signaling, and cytokine (IL-4 and IL-13) signaling, which is highly active in cancer cells. Using multiple datasets and machine learning, we constructed and validated COGS as a tool that can differentiate chRCC from RO and complement histology in routine clinical practice.

Highlights

  • Chromophobe renal cell carcinoma and oncocytoma (RO) are renal tumor types that originate from alpha-intercalated cells of the collecting ducts of the kidney [1,2] and comprise 8–12% of all renal neoplasms [3,4,5,6]

  • We identified transcriptomic differences that distinguish chromophobe renal cell carcinoma (chRCC) from RO in a meta-dataset combined from multiple studies in the Gene Expression Omnibus (GEO) and ArrayExpress

  • We developed a 30-gene chromophobe and oncocytoma related gene signature (COGS), and elucidated pathway differences between chRCC and RO

Introduction

Chromophobe renal cell carcinoma (chRCC) and oncocytoma (RO) are renal tumor types that originate from alpha-intercalated cells of the collecting ducts of the kidney [1,2] and comprise 8–12% of all renal neoplasms [3,4,5,6]. Similarities in gross morphology and histology between the two tumors often complicate the classification of samples obtained by needle biopsy, the primary method of diagnosing renal cancer [7]. Medical imaging, such as CT or MRI, fails to differentiate these tumors because of their similar appearance [8]. RO diagnosis is assisted by immunohistochemical (IHC) staining for cytokeratin 7, S100A1 [11], and kidney-specific cadherins [12]; however, the overlap of these markers between chRCC and RO makes IHC an ineffective method for distinguishing these tumors [9,10,11]. We implemented unsupervised machine learning (ML) algorithms and validated ML models to distinguish chRCC from RO.
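
The abstract describes selecting the top 30 genes by univariate expression analysis and ROC analysis. As a rough illustration of what a per-gene ROC ranking can look like, the sketch below scores each gene by its area under the ROC curve (AUC) for separating two classes and keeps the highest-scoring genes. It is not the authors' pipeline: the function names (`auc`, `rank_genes`), the gene labels, the expression values, and the choice to treat AUC as direction-agnostic are all assumptions made for this example.

```python
from typing import Dict, List, Sequence, Tuple


def auc(pos: Sequence[float], neg: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs in which the
    positive sample has the higher value; tied pairs count as 0.5."""
    if not pos or not neg:
        raise ValueError("both classes need at least one sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def rank_genes(
    expr: Dict[str, List[float]],
    labels: List[int],
    top_n: int = 30,
) -> List[Tuple[str, float]]:
    """Rank genes by how well expression alone separates the two classes.

    expr maps a gene name to one expression value per sample.
    labels holds 1 for one class (say chRCC) and 0 for the other (say RO).
    """
    scored = []
    for gene, values in expr.items():
        pos = [v for v, y in zip(values, labels) if y == 1]
        neg = [v for v, y in zip(values, labels) if y == 0]
        a = auc(pos, neg)
        # Assumption: a gene that is consistently lower in class 1 is as
        # informative as one that is consistently higher, so fold the AUC.
        scored.append((gene, max(a, 1.0 - a)))
    # sorted() is stable, so genes with equal scores keep their input order.
    return sorted(scored, key=lambda item: item[1], reverse=True)[:top_n]


# Hypothetical toy data: three placeholder genes measured in six samples.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_expr = {
    "GENE_A": [5.0, 6.0, 7.0, 1.0, 2.0, 3.0],  # higher in class 1
    "GENE_B": [1.0, 2.0, 3.0, 5.0, 6.0, 7.0],  # lower in class 1
    "GENE_C": [1.0, 5.0, 2.0, 6.0, 3.0, 4.0],  # weakly separating
}

if __name__ == "__main__":
    for gene, score in rank_genes(toy_expr, toy_labels, top_n=2):
        print(gene, round(score, 3))
```

With real data, `expr` would hold normalized expression values for every gene in the discovery dataset and `top_n` would stay at 30. The pairwise AUC loop is quadratic in the number of samples, which is acceptable for cohorts of a few hundred tumors but would normally be replaced by a rank-based computation for larger ones.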

Dataset Search and Selection
Data Preprocessing and Probe-to-Gene Conversion
Unsupervised Learning Pipeline
Differential Expression and Network Analysis and Immune Cell Infiltration
Validation Set Development and Signature Validation
Findings
Conclusions