Low-dimensional Representation Of Data Research Articles

<h3>Purpose/Objective(s)</h3> Spatial dose distribution plays an important role in radiation treatment planning. Dose data is expressed on a voxelized grid representing the patient volume, and depending on the case, the total number of voxels can be on the order of 10^6–10^8. A compact representation of 3D dose data is therefore valuable for treatment planning and downstream applications. This work aims to develop a technique to compress 3D dose data using a neural representation and to demonstrate its potential in facilitating radiation therapy dose calculation and treatment planning. <h3>Materials/Methods</h3> Rather than storing the dose value at each voxel, we propose using the weights of a multi-layer perceptron (MLP) to represent the dose data implicitly. We train a coordinate-based MLP with sinusoidal activations to map voxel spatial coordinates to their dose values. We first identify the best architecture for a given parameter budget and then use it to train one model for each patient in our dataset. At inference, the trained model is queried at each coordinate to reconstruct the 3D dose distribution. We systematically evaluate the quality of the proposed representation through experiments on dose distributions of varying complexity from different disease sites. <h3>Results</h3> In our experiments, we generate implicit neural representations for 3D dose distributions of prostate, spine, and head and neck tumor cases. The learned representations achieve a peak signal-to-noise ratio greater than 50 dB and a compression ratio of ∼32 at a target bitrate of ∼1 for dose data from all three sites. The number of parameters in the trained network is less than 4% of the average number of voxels in the dose data. Our results also show that model sizes with a bitrate of 1–2 are optimal for the task, and that performance drops significantly at lower bitrates.
<h3>Conclusion</h3> We show how to learn a low-dimensional implicit neural representation of 3D dose data methodically and accurately. The learned representation is a continuous function and can accurately model the high-frequency information in the dose data. The continuous nature of the representation allows us to sample the dose distribution at arbitrary spatial resolutions. This study lays the groundwork for future applications of neural representations of dose data in radiation oncology.
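The following is a minimal sketch, not the authors' implementation, of how a coordinate-based MLP with sinusoidal activations relates to the reported bitrate and compression ratio. Several things here are assumptions not stated in the abstract: the layer widths, the frequency scale `omega0`, the SIREN-style initialization, 32-bit storage for both weights and voxel values, bitrate measured in bits per voxel, and PSNR computed against the peak reference dose. The network is untrained; the sketch only illustrates the forward pass and the size arithmetic.

```python
import numpy as np

def init_siren(layer_sizes, omega0=30.0, seed=0):
    """Random weights for a sinusoidal MLP (omega0=30 is an assumed scale)."""
    rng = np.random.default_rng(seed)
    params = []
    for i, (n_in, n_out) in enumerate(zip(layer_sizes[:-1], layer_sizes[1:])):
        # SIREN-style uniform init (assumed): wider range for the first layer.
        bound = 1.0 / n_in if i == 0 else np.sqrt(6.0 / n_in) / omega0
        W = rng.uniform(-bound, bound, size=(n_in, n_out)).astype(np.float32)
        b = np.zeros(n_out, dtype=np.float32)
        params.append((W, b))
    return params

def siren_forward(params, coords, omega0=30.0):
    """Map (N, 3) coordinates, assumed scaled to [-1, 1], to (N,) dose values."""
    h = coords
    for W, b in params[:-1]:
        h = np.sin(omega0 * (h @ W + b))  # sinusoidal activation
    W, b = params[-1]
    return (h @ W + b)[:, 0]              # linear output layer

def count_params(params):
    """Total number of stored weights and biases."""
    return sum(W.size + b.size for W, b in params)

def bitrate(n_params, n_voxels, bits_per_weight=32):
    """Bits of model storage per voxel (assumed definition of bitrate)."""
    return n_params * bits_per_weight / n_voxels

def compression_ratio(n_params, n_voxels, bits_per_weight=32, bits_per_voxel=32):
    """Raw grid size divided by model size, both in bits."""
    return (n_voxels * bits_per_voxel) / (n_params * bits_per_weight)

def psnr(reference, reconstruction):
    """Peak signal-to-noise ratio in dB, with the peak taken from the reference."""
    mse = np.mean((reference - reconstruction) ** 2)
    return 10.0 * np.log10(np.max(reference) ** 2 / mse)

# Hypothetical example: a 128 x 128 x 64 dose grid and three hidden layers of width 126.
n_voxels = 128 * 128 * 64
params = init_siren([3, 126, 126, 126, 1])
n_params = count_params(params)

# Query the (untrained) network on a few coordinates to show the interface.
coords = np.random.default_rng(1).uniform(-1.0, 1.0, size=(5, 3))
dose = siren_forward(params, coords)

print("parameters:", n_params)
print("bitrate (bits/voxel):", bitrate(n_params, n_voxels))
print("compression ratio:", compression_ratio(n_params, n_voxels))
print("parameters / voxels:", n_params / n_voxels)
```

Under these assumptions, a bitrate near 1 bit per voxel with 32-bit weights corresponds to a parameter count of roughly 1/32 of the voxel count, which is in line with the reported compression ratio of ∼32 and the "less than 4%" parameter fraction. Because the model is a function of continuous coordinates, `siren_forward` can be evaluated at points between the original voxel centers to resample the dose at another resolution.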

Abstract Intro: Minor variations in cancer type can have a major impact on therapeutic effectiveness and on the course of drug research and development. To improve upon existing -omic data classification methods, this study seeks to correctly classify previously unknown cancer samples and to determine -omic overlap between cancers by clustering patient RNA-seq data using PCA+tSNE for dimensionality reduction. Background: Although there have been major innovations in the use of bulk RNA-seq data for in silico modeling, RNA-seq analysis has limitations. RNA-seq data has over 20,000 dimensions, which makes it difficult to perform the clustering and differential analysis that in silico modeling depends on. Dimensionality reduction is therefore key to RNA-seq analysis. A more recent approach, typically used in single-cell RNA-seq analysis, combines PCA with tSNE. A common argument against the use of tSNE beyond data visualization is that its results are inconsistent and can vary with the tSNE hyperparameters. This work demonstrates that combining PCA with tSNE creates a robust dimensionality reduction of bulk RNA-seq data that allows for more accurate clustering of patient samples. Methods: Using the TCGA dataset with over 38 cancer subtypes and 10,351 total samples, the data was first reduced to 50 principal components using PCA. tSNE was then applied with 1000-2000 iterations, a learning rate of 200, and perplexity varied from 5 to 50. Robustness was measured by how well cancers of the same subtype are grouped in the lower-dimensional space, in this case 2D. Agglomerative and k-means clustering algorithms were applied to the dimensionality reduction results in order to track the accuracy of clustering on cancer subtypes and the movement of samples between clusters. Results: For pure tSNE without PCA pre-reduction, clustering was not robust and showed less distinction between cancer subtypes and an increased number of outliers. Clustering accuracy was 50-60% for both k-means and agglomerative models across all varied hyperparameters. PCA+tSNE produced the most robust results, with 60-70% accuracy for both k-means and agglomerative models across all varied hyperparameters. A low perplexity of 5-20, a learning rate of 100-300, and a higher number of iterations (>1,000) produced the best results. Conclusions: PCA+tSNE is able to create accurate low-dimensional representations of patient RNA-seq data that can be used to determine similarities and differences between patient samples based on gene expression data. The 10% variation in clustering accuracy suggests the method is robust and may help inform research and treatment by more robustly classifying cancer samples and by identifying similarities between cancers, especially among underserved, misdiagnosed, and rare cancers. Citation Format: Michael Bocker, Mikhail G. Grushko, Katherine E. Arline. Toward improved cancer classification using PCA + tSNE dimensionality reduction on bulk RNA-seq data [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2022; 2022 Apr 8-13. Philadelphia (PA): AACR; Cancer Res 2022;82(12_Suppl):Abstract nr 2708.
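The pipeline described in the Methods (PCA to 50 components, then tSNE to 2D, then clustering) can be sketched with scikit-learn as below. This is an illustrative sketch under stated assumptions rather than the authors' code: the data is a small synthetic stand-in instead of TCGA, the function names `pca_tsne_embed` and `cluster_purity` are hypothetical, the iteration count is left at the library default, and "clustering accuracy" is approximated here by majority-label cluster purity because the abstract does not define the metric.

```python
import numpy as np
from collections import Counter
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

def pca_tsne_embed(X, n_pcs=50, perplexity=30.0, learning_rate=200.0, seed=0):
    """Reduce X (samples x features) to n_pcs components, then to 2D with t-SNE."""
    X_pca = PCA(n_components=n_pcs, random_state=seed).fit_transform(X)
    tsne = TSNE(n_components=2, perplexity=perplexity,
                learning_rate=learning_rate, init="pca", random_state=seed)
    return tsne.fit_transform(X_pca)

def cluster_purity(labels_true, labels_pred):
    """Fraction of samples that carry the majority true label of their cluster.

    Assumed stand-in for the abstract's "clustering accuracy".
    """
    correct = 0
    for c in set(labels_pred):
        members = [t for t, p in zip(labels_true, labels_pred) if p == c]
        correct += Counter(members).most_common(1)[0][1]
    return correct / len(labels_true)

# Synthetic stand-in for expression data: 3 "subtypes", 60 samples each, 500 features.
rng = np.random.default_rng(0)
centers = rng.normal(0.0, 3.0, size=(3, 500))
y = np.repeat(np.arange(3), 60)
X = centers[y] + rng.normal(0.0, 1.0, size=(180, 500))

# Embed, then cluster in the 2D space as the abstract describes.
emb = pca_tsne_embed(X)
pred = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(emb)
purity = cluster_purity(y, pred)

print("embedding shape:", emb.shape)
print("purity:", purity)
```

To probe robustness in the spirit of the abstract, one could call `pca_tsne_embed` repeatedly with `perplexity` values between 5 and 50 and compare the resulting purity scores; `sklearn.cluster.AgglomerativeClustering` can be substituted for `KMeans` in the same way.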

Articles published on Low-dimensional Representation Of Data

Multiple Graph Adaptive Regularized Semi-Supervised Nonnegative Matrix Factorization with Sparse Constraint for Data Representation

Spatially aware dimension reduction for spatial transcriptomics

Hybrid Kronecker Product Decomposition and Approximation

Analysis of UMAP, the method for reducing the dimensionality of initial data in machine learning for the purpose of failure prediction in a motive power service

Time-series image denoising of pressure-sensitive paint data by projected multivariate singular spectrum analysis

Self-representative kernel concept factorization

Neural Representation for Three-Dimensional Dose Distribution and its Applications in Precision Radiation Therapy

HyperNTF: A hypergraph regularized nonnegative tensor factorization for dimensionality reduction

Local Learning-based Multi-task Clustering

Manifold-informed state vector subset for reduced-order modeling

Structured graph optimization for joint spectral embedding and clustering

Differential Private Deep Learning Models for Analyzing Breast Cancer Omics Data.

Abstract 2708: Toward improved cancer classification using PCA + tSNE dimensionality reduction on bulk RNA-seq data

Implicit neural representation for radiation therapy dose distribution

Predicting waves in fluids with deep neural network

A Multi-View Learning based Clustering Method for Health Care System

A survey of structural representation learning for social networks

GraphVAMPNet, using graph neural networks and variational approach to Markov processes for dynamical modeling of biomolecules.

Deep Isometric Maps

Dimension reduced turbulent flow data from deep vector quantisers
