Abstract

Purpose: To develop a Shiny application that helps researchers integrate cancer datasets and select an appropriate method for correcting technical artifacts.

Description: Integrative analysis of heterogeneous expression data remains challenging due to variations in platform, RNA quality, sample processing, and other unknown technical effects. As the field performs omics profiling of samples from cancer patients and murine models, there is a need to identify, harmonize, and correct these technical effects to ensure robust analysis of the treatment and conditional effects of the underlying genetics or biological events. However, selecting and implementing different approaches for removing unwanted batch effects can be time-consuming and tedious, especially for more biologically focused investigators. In this project, we present BATCH-FLEX, a Shiny app for rapidly visualizing the results of established batch correction methods such as ComBat, Mean Centering, ComBatSeq, and Limma RemoveBatchEffect. With BATCH-FLEX, users can visualize the contribution of a factor to the variance before and after correction using principal component analysis, relative log expression plots, heatmaps, and explanatory variables. Users can also save all plots and matrices as a single ZIP file for downstream analysis.

Results: As a proof of concept, we assessed BATCH-FLEX using simulated data generated from a linear model framework introduced by Gagnon-Bartsch and Speed, which assumes that gene expression measurements can be distilled to a combination of biological signal, systemic noise, and random noise. BATCH-FLEX successfully identified and removed the introduced effect with each of the batch correction methods listed above. Next, we evaluated BATCH-FLEX using a comprehensive collection of bladder cancer data consisting of microarray data from 13 studies spanning 1452 samples. After removal of study-dependent noise, BATCH-FLEX revealed the heterogeneity among bladder cancers based on known sample type annotations.

Conclusion: We have developed BATCH-FLEX, a tool for oncologic researchers to rapidly assess, select, and implement commonly used batch correction methods. This tool is available at https://github.com/shawlab-moffitt/BATCH-FLEX. Our integrative web portal of a Bladder Cancer Resource for Translational Science (BEACON) will also be shared at the meeting.

Citation Format: Joshua Davis, Alyssa Obermayer, Thac Duong, Rebecca Hesterberg, Xuefeng Wang, Mingxiang Teng, G. Daniel Grass, Timothy Shaw. BATCH-FLEX: Feature-level equalization of x-batch in heterogeneous cancer data [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2024; Part 1 (Regular Abstracts); 2024 Apr 5-10; San Diego, CA. Philadelphia (PA): AACR; Cancer Res 2024;84(6_Suppl):Abstract nr 7423.
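The simulation idea described in the Results (expression as biological signal plus a batch-dependent term plus random noise) can be illustrated with a minimal sketch. This is not BATCH-FLEX's implementation, which is an R Shiny app; the function names `simulate`, `mean_center`, and `batch_gap`, the effect sizes, and the balanced two-batch design are all assumptions chosen for illustration. The correction shown is plain per-gene, per-batch mean centering, the simplest of the methods named in the abstract.

```python
import random
from statistics import mean


def simulate(n_genes=50, n_per_group=10, seed=0):
    """Simulate expression = biological signal + batch shift + random noise.

    Hypothetical settings: two batches, two conditions, balanced design.
    """
    rng = random.Random(seed)
    # Every batch contains both conditions, so batch and biology are not confounded.
    samples = [(batch, cond)
               for batch in (0, 1)
               for cond in (0, 1)
               for _ in range(n_per_group)]
    data = []  # one row per gene, one column per sample
    for _ in range(n_genes):
        bio = rng.gauss(0, 1)    # per-gene condition effect (biological signal)
        shift = rng.gauss(0, 2)  # per-gene batch effect (technical shift)
        data.append([bio * cond + shift * batch + rng.gauss(0, 0.5)
                     for batch, cond in samples])
    return data, samples


def mean_center(data, batches):
    """Subtract each gene's per-batch mean from the samples in that batch."""
    corrected = []
    for row in data:
        batch_means = {
            b: mean(v for v, bb in zip(row, batches) if bb == b)
            for b in set(batches)
        }
        corrected.append([v - batch_means[b] for v, b in zip(row, batches)])
    return corrected


def batch_gap(data, batches):
    """Average over genes of |mean(batch 0) - mean(batch 1)|."""
    gaps = []
    for row in data:
        m0 = mean(v for v, b in zip(row, batches) if b == 0)
        m1 = mean(v for v, b in zip(row, batches) if b == 1)
        gaps.append(abs(m0 - m1))
    return mean(gaps)


data, samples = simulate()
batches = [b for b, _ in samples]
corrected = mean_center(data, batches)

# The gap between batch means should be large before and near zero after.
print("before:", batch_gap(data, batches))
print("after: ", batch_gap(corrected, batches))
```

Because the simulated design is balanced, centering each batch removes the batch shift without removing the condition effect. In an unbalanced or confounded design, mean centering can also remove biological signal, which is one reason to compare several correction methods side by side.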