PCTA, a pan-cancer cell line transcriptome atlas

Siyuan Cheng, Lin Li, Xiuping Yu

doi:10.1016/j.canlet.2024.216808

Abstract

A substantial volume of RNA sequencing data has been generated from cancer cell lines. However, comparing gene expression levels across cell lines requires specific bioinformatics skills, which has hindered non-bioinformaticians from fully utilizing these valuable datasets in their research. To bridge this gap, we established a curated Pan-cancer Cell Line Transcriptome Atlas (PCTA) dataset. This resource aims to allow researchers without extensive bioinformatics expertise to access and leverage the information within the dataset for their studies. The PCTA dataset comprises an expression matrix of 24,965 genes across 84,385 samples derived from 5,677 studies. The compilation spans 535 cell lines, representing 114 cancer types originating from 30 tissue types. On UMAP plots, cell lines originating from the same type of tissue tend to cluster together, illustrating the dataset's ability to capture biological relationships. Additionally, an interactive web application (https://pcatools.shinyapps.io/PCTA_app/) was developed for researchers to explore the PCTA dataset. This platform allows users to examine the expression of their genes of interest across a diverse array of samples.
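As a rough illustration of the kind of comparison described above, the following minimal Python sketch summarizes one gene's expression per cell line or per tissue from a gene-by-sample matrix with sample annotations. It is not the PCTA pipeline or the web application's code: the gene names, cell line and tissue labels, expression values, and the helper `median_by_group` are all hypothetical and chosen only to show the data layout.

```python
from collections import defaultdict
from statistics import median

# Hypothetical sample annotations; labels are placeholders, not PCTA entries.
samples = [
    {"id": "S1", "cell_line": "line_A", "tissue": "tissue_X"},
    {"id": "S2", "cell_line": "line_A", "tissue": "tissue_X"},
    {"id": "S3", "cell_line": "line_B", "tissue": "tissue_Y"},
    {"id": "S4", "cell_line": "line_B", "tissue": "tissue_Y"},
    {"id": "S5", "cell_line": "line_C", "tissue": "tissue_Y"},
]

# Hypothetical gene-by-sample expression matrix (made-up values in arbitrary
# normalized units); each list follows the order of `samples`.
expression = {
    "GENE_A": [8.0, 9.0, 1.0, 2.0, 0.5],
    "GENE_B": [0.2, 0.4, 7.0, 6.0, 3.0],
}


def median_by_group(gene, group_key):
    """Return the median expression of `gene` for each value of `group_key`."""
    groups = defaultdict(list)
    for sample, value in zip(samples, expression[gene]):
        groups[sample[group_key]].append(value)
    return {name: median(values) for name, values in groups.items()}


if __name__ == "__main__":
    print(median_by_group("GENE_A", "cell_line"))
    print(median_by_group("GENE_A", "tissue"))
```

Grouping by `"cell_line"` or by `"tissue"` uses the same function, which mirrors how a single annotated matrix can support comparisons at either level.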

Full Text