Mutation Clusters from Cancer Exome

Zura Kakushadze, Willie Yu

doi:10.3390/genes8080201

Abstract

We apply our statistically deterministic machine learning/clustering algorithm *K-means (recently developed in https://ssrn.com/abstract=2908286) to 10,656 published exome samples for 32 cancer types. A majority of cancer types exhibit a mutation clustering structure. Our results are in-sample stable. They are also out-of-sample stable when applied to 1,389 published genome samples across 14 cancer types. In contrast, we find in- and out-of-sample instabilities in cancer signatures extracted from exome samples via nonnegative matrix factorization (NMF), a computationally costly and non-deterministic method. Extracting stable mutation structures from exome data could have important implications for speed and cost, which are critical for early-stage cancer diagnostics, such as novel blood-test methods currently in development.

Highlights

Unless humanity finds a cure, about a billion people alive today will die of cancer.
Apart from applying *K-means to exome data, we perform out-of-sample stability analysis of our results here. We use data consisting of 10,656 published exome samples aggregated by 32 cancer types listed in Table 1, which summarizes total occurrence counts, numbers of samples and data sources.
We use: iter.max = 100 (this is the maximum number of iterations used in the built-in R function kmeans(); there was not a single instance in our 30 million runs of kmeans() where more iterations were required, and kmeans() produces a warning if it does not converge within iter.max); num.try = 1000; and num.runs = 30,000.
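The parameter settings above can be illustrated with a minimal sketch. It is written in Python rather than R, and it is not the authors' *K-means implementation: it only shows a plain Lloyd-style k-means with an iteration cap, run num.try × num.runs times, with a count of runs that hit the cap. The function names `kmeans` and `count_nonconverged`, the toy data, and the small run counts are all assumptions made for illustration.

```python
import random


def kmeans(points, k, iter_max, rng):
    """Plain Lloyd-style k-means on a list of equal-length tuples.

    Returns (labels, converged). `converged` is False when the iteration
    cap is reached before the assignments stop changing (the situation in
    which R's kmeans() would warn).
    """
    centers = rng.sample(points, k)  # random initial centers drawn from the data
    labels = None
    for _ in range(iter_max):
        # Assignment step: nearest center by squared Euclidean distance.
        new_labels = [
            min(range(k),
                key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centers[c])))
            for p in points
        ]
        if new_labels == labels:
            return labels, True
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster empties out
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, False


def count_nonconverged(points, k, iter_max, num_try, num_runs, seed=0):
    """Run k-means num_try * num_runs times; count runs that hit iter_max."""
    rng = random.Random(seed)
    total = num_try * num_runs
    failures = 0
    for _ in range(total):
        _, converged = kmeans(points, k, iter_max, rng)
        failures += not converged
    return total, failures


# The settings quoted in the text imply this many k-means calls in total.
SOURCE_TOTAL_RUNS = 1000 * 30_000  # num.try * num.runs = 30 million

# Toy data (hypothetical): two well-separated groups in 2D.
toy = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (5.0, 5.1), (5.2, 4.9), (4.9, 5.3)]

# Scaled-down run counts so the sketch finishes instantly.
total, failures = count_nonconverged(toy, k=2, iter_max=100, num_try=10, num_runs=5)
print(total, failures)
```

The product num.try × num.runs = 1000 × 30,000 matches the 30 million runs of kmeans() mentioned in the text. How the individual runs are combined into a single clustering is described in the *K-means reference and is not reproduced here.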

Summary

Introduction and Summary

Unless humanity finds a cure, about a billion people alive today will die of cancer. Unlike other diseases, cancer occurs at the DNA level via somatic alterations in the genome. Considering that various signatures may be somatic mutational noise artifacts in the first instance and statistical error bars are large, it is natural to wonder whether there are some robust underlying clustering structures present in the data, with the understanding that such structures may not be present for all cancer types. Even if they are present for a substantial number of cancer types, unveiling them would amount to a major step forward in understanding cancer signature structure. We discuss how the input data (i.e., matrices of somatic mutation counts for cancer exome) are used in the context of *K-means in Section 3.2 (see [16] for technical details of *K-means).

Data Summary

Structure of the Data

Exome Data Results

Within-Cluster Correlations

Overall Correlations

Interpretation

Concluding Remarks

Journal: Genes
Publication Date: Aug 15, 2017
Citations: 2
License type: CC BY 4.0


