A Weakly Supervised Clustering Method for Cancer Subgroup Identification

Duygu Ozcelik, Öznur Taştan

doi:10.17694/bajece.1033807

Abstract

Identifying subgroups of cancer patients is important because it opens up possibilities for targeted therapeutics. A widely applied approach is to group patients with unsupervised clustering techniques based on molecular data from tumor samples. The resulting patient clusters are considered of interest if they can be associated with a clinical outcome variable such as patient survival. However, these clinical variables of interest do not participate in the clustering decisions. We propose an approach, WSURFC (Weakly Supervised Random Forest Clustering), in which the clustering process is weakly supervised with a clinical variable of interest. The supervision step is handled by learning a similarity metric with features that are selected to predict this clinical variable. More specifically, WSURFC involves a random forest classifier-training step to predict the clinical variable, in this case the survival class. Subsequently, the internal nodes of the trained trees are used to derive a random forest similarity metric among pairs of samples. In this way, the clustering step utilizes the nonlinear subspace of the original features learned in the classification step. We first demonstrate WSURFC on handwritten digit datasets, where it is able to capture salient structural similarities of digit pairs. Next, we apply WSURFC to find breast cancer subtypes using mRNA, protein, and microRNA expression as features. Our results on breast cancer show that WSURFC could identify interesting patient subgroups more effectively than widely adopted methods.
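The two-stage idea described in the abstract (train a random forest on a supervising variable, then cluster on a forest-derived similarity) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes scikit-learn and SciPy are available, and it swaps in the common leaf co-occurrence proximity as a stand-in, since the abstract does not spell out the exact metric derived from the internal nodes. The function names, the coarse binary label used as the "weak" supervising variable, the subset of 300 digits, average-linkage clustering, and the choice of four clusters are all hypothetical choices made for illustration.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier


def leaf_cooccurrence_similarity(forest, X):
    """Fraction of trees in which each pair of samples lands in the same leaf.

    Stand-in proximity; the paper's metric based on internal nodes may differ.
    """
    leaves = forest.apply(X)  # shape: (n_samples, n_trees), leaf index per tree
    same_leaf = leaves[:, None, :] == leaves[None, :, :]
    return same_leaf.mean(axis=2)


def weakly_supervised_clusters(X, weak_labels, n_clusters, seed=0):
    """Fit a forest on the supervising labels, then cluster on its similarity."""
    forest = RandomForestClassifier(n_estimators=100, random_state=seed)
    forest.fit(X, weak_labels)

    similarity = leaf_cooccurrence_similarity(forest, X)
    distance = 1.0 - similarity  # zero on the diagonal, symmetric

    # Average-linkage hierarchical clustering on the condensed distance matrix
    # (an illustrative choice, not taken from the paper).
    condensed = squareform(distance, checks=False)
    tree = linkage(condensed, method="average")
    labels = fcluster(tree, t=n_clusters, criterion="maxclust")
    return similarity, labels


# Illustrative data: a subset of the scikit-learn digits dataset.
digits = load_digits()
X = digits.data[:300]
# Hypothetical weak supervision: a coarse binary label (digit below 5 or not)
# plays the role that the survival class plays in the abstract.
weak_labels = (digits.target[:300] < 5).astype(int)

similarity, cluster_labels = weakly_supervised_clusters(X, weak_labels, n_clusters=4)
print(np.bincount(cluster_labels)[1:])  # cluster sizes
```

In this sketch the supervising variable shapes only the similarity: the clustering step itself never sees the labels, which is one way to read "weakly supervised" here. Any clustering method that accepts a precomputed distance matrix could replace the hierarchical step.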

Journal: Balkan Journal of Electrical and Computer Engineering
Publication Date: Apr 30, 2022
License type: CC BY
