Abstract

Single nucleotide polymorphisms (SNPs) play a vital role in genome analysis and are potentially beneficial in the study of carcinogenesis and cancer drug therapy. Because the data are extremely high-dimensional and contain large numbers of SNPs, the SNPs are represented as feature elements of the cancer subject dataset. This research shows associations between SNPs and cancer growth. SNPs, together with DNA microarrays, supply the genetic variability in the study. Because data acquisition collects millions of data elements across the genome and yields voluminous genetic information, arrays are well suited to exploration in molecular medicine. Efficient mathematical modeling is therefore needed to conduct genome-wide pattern searches for SNP associations with phenotype.

Objective: To analyze high-dimensional oncology datasets in order to detect cancer, assess cancer prognosis, identify gene expression responsible for tumor formation, and assess the potential of machine learning algorithms in identifying cancer.

Methods: High-dimensional oncology datasets were fed into the database. Subjects investigated were primarily from the brain tumor family, such as:
• Acoustic Neuroma
• Astrocytoma:
  ∘ Grade I - Pilocytic Astrocytoma
  ∘ Grade II - Low-grade Astrocytoma
  ∘ Grade III - Anaplastic Astrocytoma
  ∘ Grade IV - Glioblastoma (GBM)
• Chordoma
• CNS Lymphoma
• Craniopharyngioma
• Other Gliomas:
  ∘ Brain Stem Glioma
  ∘ Ependymoma
  ∘ Mixed Glioma
  ∘ Optic Nerve Glioma
  ∘ Subependymoma
• Medulloblastoma
• Meningioma
• Metastatic Brain Tumors
• Oligodendroglioma
• Pituitary Tumors
• Primitive Neuroectodermal Tumors (PNET)
• Other Brain-Related Conditions
• Schwannoma
• Juvenile Pilocytic Astrocytoma (JPA)
• Pineal Tumor
• Rhabdoid Tumor
A mathematical framework is proposed that runs machine learning algorithms to predict and forecast disease and to assess gene profiling. The proposed wavelet transform can hold a large amount of data and reduce data dimensions. Feature selection and feature extraction provide the relevant information required for cancer detection and mapping with SNPs.

Results: The following were identified: TP53 mutations, PDGF/PDGFR expression, EGFR/MDM2 amplification, and CDKN2A and PTEN mutation and deletion for precursor cells to Glioblastoma IV and Anaplastic Oligodendroglioma IV with the 10 - 19 p loss.

Conclusions: The proposed mathematical framework and machine learning algorithm are extremely fast at computing large volumes of data and provide an approximation of cancer recurrence.

Note: This abstract was not presented at the conference because the presenter was unable to attend.

Citation Format: Akash Singh. High-dimensional data analysis in oncology. [abstract]. In: Proceedings of the AACR Special Conference on Post-GWAS Horizons in Molecular Epidemiology: Digging Deeper into the Environment; 2012 Nov 11-14; Hollywood, FL. Philadelphia (PA): AACR; Cancer Epidemiol Biomarkers Prev 2012;21(11 Suppl):Abstract nr 01.
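
The Methods mention a wavelet transform that reduces data dimensions, followed by feature selection. The abstract does not name the wavelet family, the number of decomposition levels, or the selection criterion, so the following is only a minimal sketch under assumptions: a Haar wavelet, two levels, and a simple variance ranking. The function names (`haar_approximation`, `select_by_variance`) and the toy genotype matrix are hypothetical and are not taken from the author's framework.

```python
import numpy as np

def haar_approximation(x, levels=1):
    """Return the Haar approximation coefficients of each row.

    Each level halves the number of columns (assumed wavelet: Haar).
    """
    x = np.asarray(x, dtype=float)
    for _ in range(levels):
        if x.shape[1] % 2:  # odd width: pad by repeating the last column
            x = np.hstack([x, x[:, -1:]])
        # Orthonormal Haar low-pass step: scaled sum of adjacent pairs.
        x = (x[:, 0::2] + x[:, 1::2]) / np.sqrt(2.0)
    return x

def select_by_variance(features, k):
    """Keep the k columns with the largest variance across subjects.

    Variance ranking is an assumed stand-in for the unspecified
    feature-selection step.
    """
    variances = features.var(axis=0)
    keep = np.sort(np.argsort(variances)[-k:])
    return features[:, keep], keep

# Hypothetical toy data: 6 subjects x 16 SNPs coded as minor-allele counts 0/1/2.
rng = np.random.default_rng(0)
genotypes = rng.integers(0, 3, size=(6, 16))

reduced = haar_approximation(genotypes, levels=2)  # 16 columns -> 4 columns
selected, kept = select_by_variance(reduced, k=2)  # 4 columns -> 2 columns
print(reduced.shape, selected.shape)  # prints (6, 4) (6, 2)
```

Each Haar level replaces adjacent pairs of columns with their scaled sum, so the feature count halves per level while the coarse signal is retained. A real genome-wide dataset would need a chosen SNP ordering (for example, by chromosomal position) for neighbouring columns to be meaningful.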
