Abstract Intro: Minor variations in cancer type can have a major impact on therapeutic effectiveness and on the course of drug research and development. To improve upon existing -omic data classification methods, this study seeks to correctly classify previously unknown cancer samples and to determine -omic overlap between cancers by clustering patient RNA-seq data, using PCA+tSNE for dimensionality reduction. Background: Although there have been major innovations in the use of bulk RNA-seq data for in silico modeling, RNA-seq analysis has limitations. RNA-seq data has over 20,000 dimensions, which makes it difficult to perform the clustering and differential analysis that in silico modeling depends on. Dimensionality reduction is therefore key to RNA-seq analysis. A novel method typically used in single-cell RNA-seq analysis combines PCA with tSNE. A common argument against using tSNE beyond data visualization is that its results are inconsistent and vary with its hyperparameters. This work demonstrates that combining PCA with tSNE creates a robust dimensionality reduction of bulk RNA-seq data that allows for more accurate clustering of patient samples. Methods: Using the TCGA dataset, with over 38 cancer subtypes and 10,351 total samples, the data were reduced to 50 principal components using PCA. tSNE was then applied with 1,000-2,000 iterations, a learning rate of 200, and perplexity varied from 5 to 50. Robustness was measured by how well cancers of the same subtype grouped together in the lower-dimensional space, in this case 2D. Agglomerative and k-means clustering algorithms were applied to the dimensionality reduction results to track the accuracy of clustering on cancer subtypes and the movement of samples between clusters. Results: For pure tSNE without PCA pre-reduction, clustering was not robust and showed less distinction between cancer subtypes and an increased number of outliers. 
With pure tSNE, clustering accuracy was 50-60% for both k-means and agglomerative models across all hyperparameter settings. PCA+tSNE produced the most robust results, with 60-70% accuracy for both k-means and agglomerative models across all hyperparameter settings. A low perplexity of 5-20, a learning rate of 100-300, and more iterations (>1,000) produced the best results. Conclusions: PCA+tSNE can create accurate low-dimensional representations of patient RNA-seq data that can be used to determine similarities and differences between patient samples based on gene expression. The 10-percentage-point spread in clustering accuracy across hyperparameter settings suggests the method is robust, and it may help inform research and treatment by classifying cancer samples more reliably and by identifying similarities between cancers, especially among underserved, misdiagnosed, and rare cancers. Citation Format: Michael Bocker, Mikhail G. Grushko, Katherine E. Arline. Toward improved cancer classification using PCA + tSNE dimensionality reduction on bulk RNA-seq data [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2022; 2022 Apr 8-13. Philadelphia (PA): AACR; Cancer Res 2022;82(12_Suppl):Abstract nr 2708.
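
Illustrative sketch: The pipeline described under Methods (PCA to 50 components, tSNE to 2D, then k-means and agglomerative clustering) might look roughly like the following. This is a minimal sketch under stated assumptions, not the authors' code: it assumes scikit-learn, substitutes synthetic `make_blobs` data for the TCGA expression matrix, and scores clusters by majority-vote purity because the abstract does not say how clustering accuracy was computed. The names `pca_tsne_embed` and `cluster_purity` are hypothetical. The tSNE iteration count is left at the library default rather than set explicitly.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering, KMeans
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE


def pca_tsne_embed(X, n_pcs=50, perplexity=20.0, learning_rate=200.0, seed=0):
    """Reduce X to n_pcs principal components, then embed in 2D with tSNE."""
    X_pca = PCA(n_components=n_pcs, random_state=seed).fit_transform(X)
    tsne = TSNE(
        n_components=2,
        perplexity=perplexity,        # the abstract varies this from 5 to 50
        learning_rate=learning_rate,  # the abstract reports 200
        init="pca",
        random_state=seed,
    )
    return tsne.fit_transform(X_pca)


def cluster_purity(labels_true, labels_pred):
    """Fraction of samples whose true label is the majority label of their cluster.

    Assumption: this stands in for the abstract's unspecified accuracy measure.
    Expects non-negative integer true labels.
    """
    labels_true = np.asarray(labels_true)
    labels_pred = np.asarray(labels_pred)
    correct = 0
    for cluster in np.unique(labels_pred):
        members = labels_true[labels_pred == cluster]
        correct += np.bincount(members).max()
    return correct / labels_true.size


# Synthetic stand-in for an expression matrix: 300 samples, 200 features, 4 groups.
X, y = make_blobs(n_samples=300, n_features=200, centers=4,
                  cluster_std=5.0, random_state=0)

embedding = pca_tsne_embed(X)

# Cluster the 2D embedding with both algorithms named in the abstract.
kmeans_labels = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(embedding)
agglo_labels = AgglomerativeClustering(n_clusters=4).fit_predict(embedding)

purity_kmeans = cluster_purity(y, kmeans_labels)
purity_agglo = cluster_purity(y, agglo_labels)
print(f"k-means purity: {purity_kmeans:.2f}, agglomerative purity: {purity_agglo:.2f}")
```

Repeating the embedding over a grid of perplexity values and random seeds, and comparing the resulting purity scores, would be one way to probe the hyperparameter sensitivity that the abstract discusses.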