Mining Datasets for Molecular Subtyping in Cancer

Sally Yepes, Maria Mercedes Torres

doi:10.4172/2153-0602.1000185

Abstract

Given the heterogeneity in the clinical behavior of cancer patients with identical histopathological diagnoses, the search for unrecognized molecular subtypes and subtype-specific markers, together with the evaluation of their clinical and biological relevance, is a necessity. This task now benefits from high-throughput genomic technologies and from free access to the datasets generated by international genomic projects and data repositories. Machine learning strategies have proven useful for identifying hidden trends in large datasets, contributing to the understanding of molecular mechanisms and to the subtyping of cancer. However, translating new molecular subclasses and biomarkers into clinical settings requires analytic validation and clinical trials to determine their clinical utility. Here, we provide an overview of the workflow used to identify and confirm cancer subtypes, summarize a variety of methodological principles, and highlight representative studies. The generation of public big data on the most common malignancies is turning molecular pathology into a database-driven discipline.

Highlights

The diagnosis of cancer is made primarily through histopathological classification systems that take into account the morphological characteristics of the tumor, allowing its identification and clinical stage assignment.
It is necessary to identify patterns in large datasets and at a genome-wide scale using machine learning strategies. This task benefits from high-throughput genomic technologies, the enormous number of genomic datasets generated by international genomic projects, and the availability of data analysis algorithms, allowing a comprehensive and unprecedented characterization of the disease.
In the case of microarray data, raw data are pre-processed in three steps: background correction, which adjusts the intensity readings for nonspecific signals; normalization, which adjusts the intensity readings for technical variability so that the measurements of all samples are comparable; and summarization, which computes a single value from the different probes representing each gene.
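The three pre-processing steps can be illustrated with a minimal Python sketch. It is not the procedure used in any particular study: the function names, the per-sample background estimate, the choice of quantile normalization, and the median summary are all assumptions made for illustration, and real pipelines usually rely on established packages and work on log-transformed intensities.

```python
from statistics import mean, median

# Hypothetical layout: rows are probes, columns are samples.

def background_correct(intensities, background, floor=1.0):
    """Step 1: subtract an assumed per-sample background, clamped at a small floor."""
    return [[max(x - b, floor) for x, b in zip(row, background)]
            for row in intensities]

def quantile_normalize(matrix):
    """Step 2: give every sample (column) the same intensity distribution."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    columns = [[row[j] for row in matrix] for j in range(n_cols)]
    sorted_cols = [sorted(col) for col in columns]
    # The rank-wise mean across samples is the shared reference distribution.
    reference = [mean(sc[i] for sc in sorted_cols) for i in range(n_rows)]
    normalized = [[0.0] * n_cols for _ in range(n_rows)]
    for j, col in enumerate(columns):
        # Ties are broken by probe order; a fuller version would average them.
        order = sorted(range(n_rows), key=lambda i: col[i])
        for rank, i in enumerate(order):
            normalized[i][j] = reference[rank]
    return normalized

def summarize_by_gene(matrix, probe_genes):
    """Step 3: collapse the probes of each gene to one median value per sample."""
    grouped = {}
    for row, gene in zip(matrix, probe_genes):
        grouped.setdefault(gene, []).append(row)
    return {gene: [median(col) for col in zip(*rows)]
            for gene, rows in grouped.items()}

# Toy data: four probes, two samples; the second sample is uniformly brighter.
raw = [[110, 220], [210, 420], [310, 620], [410, 820]]
background = [10, 20]                # assumed nonspecific signal per sample
probe_genes = ["A", "A", "B", "B"]   # hypothetical probe-to-gene mapping

corrected = background_correct(raw, background)
normalized = quantile_normalize(corrected)
gene_values = summarize_by_gene(normalized, probe_genes)
```

After normalization the two toy samples share the same distribution, so the brightness difference between them no longer dominates the per-gene values.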

Summary

Introduction

The diagnosis of cancer is made primarily through histopathological classification systems that take into account the morphological characteristics of the tumor, allowing its identification and clinical stage assignment. The existing histopathological subtypes are heterogeneous; this is evident at the levels of molecular pathogenesis, clinical course, and treatment responsiveness [1,2]. Machine learning approaches can be used to dissect the complexity of cancer. These are computational tools that recognize and classify patterns based on models derived from the data. Machine learning for cancer subtyping has been performed mainly with expression data. The technique can also be applied to other levels of biological information, such as promoter methylation, miRNAs, and single nucleotide polymorphisms, analyzed with hybridization array technology or next-generation sequencing, allowing the study of the data structure at many different levels and providing an integrated view of the biological processes involved.
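As a rough illustration of how such tools recognize and then classify patterns, the following Python sketch first groups toy expression profiles without labels (average-linkage agglomerative clustering) and then assigns a new sample to the nearest cluster centroid. The data, function names, and choice of algorithms are assumptions for illustration only and are not taken from the studies discussed here.

```python
from math import dist

def average_linkage_clusters(profiles, k):
    """Unsupervised step: merge the two closest clusters until k remain."""
    clusters = [[i] for i in range(len(profiles))]

    def linkage(a, b):
        # Average Euclidean distance between all cross-cluster pairs.
        total = sum(dist(profiles[i], profiles[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > k:
        pairs = [(linkage(clusters[a], clusters[b]), a, b)
                 for a in range(len(clusters))
                 for b in range(a + 1, len(clusters))]
        _, a, b = min(pairs)
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters

def centroid(profiles, members):
    """Mean expression profile of the samples in one cluster."""
    return [sum(vals) / len(members)
            for vals in zip(*(profiles[i] for i in members))]

def nearest_centroid(sample, centroids):
    """Supervised-style step: index of the closest cluster centroid."""
    return min(range(len(centroids)), key=lambda c: dist(sample, centroids[c]))

# Toy expression profiles (hypothetical): four patients, three genes each.
profiles = [
    [1.0, 1.2, 0.9],
    [1.1, 0.8, 1.0],
    [5.0, 5.2, 4.9],
    [4.8, 5.1, 5.3],
]

clusters = average_linkage_clusters(profiles, k=2)
centroids = [centroid(profiles, members) for members in clusters]

new_patient = [4.9, 5.0, 5.0]
assigned = nearest_centroid(new_patient, centroids)
```

In practice the candidate subtypes found by clustering would still need validation on independent datasets before any classifier built on them is trusted.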

Unsupervised and Supervised Learning for Cancer Study

Classification method Linear discriminant analysis

DLBCL Breast cancer

Subtypes with molecular heterogeneity

Clustering of patients

Classification and validation of results

Datasets and Analysis Tools

Findings

Conclusion


Journal: Journal of Data Mining in Genomics & Proteomics	Publication Date: Jan 1, 2016
Citations: 1	License type: cc-by
