Abstract

Background and Aims
Cultured kidney cell lines are broadly used to model the physiology and pathophysiology of the kidney. Due to immortalization, passaging and culture conditions, a cell line's gene expression profile might differ greatly from that of the primary cells it was originally derived from. These differences might be further amplified by experimental treatments with compounds or genetic modifications. It is therefore relevant to ask whether a given cell culture system is still an appropriate model for a certain disease or scientific investigation. Reasoning that RNA sequencing (RNA-seq) based transcriptomes are generated as part of many experimental setups and hence may be used to address this question, we developed and tested an RNA-seq based approach, CellMatchR. This approach matches kidney cell lines, primary cells and even tissue specimens to known kidney reference cell types to determine which primary kidney cell type is the most similar and to what degree global gene expression or the expression of selected marker genes differs.

Method
CellMatchR uses published murine and human kidney single-cell and tubule-level transcriptomic datasets from healthy mice and human donors as references. Single-cell datasets were further processed to pseudobulk references for each of the contained cell types. RNA-seq data from cell lines or tissues of interest (test data) were then compared to the reference (pseudo)bulk data using either Spearman correlations of gene counts-per-million (CPM) values or Euclidean distances between log-transformed CPM values. Both approaches were systematically tested for various combinations of test and reference datasets, within and across species, using either global gene expression (i.e. all genes measured in both reference and test datasets) or a set of 315 manually curated kidney cell type marker genes.

Results
We sequentially used different kidney tissue types, primary cells and cell lines as positive controls and matched them to our reference datasets.
Spearman correlations of gene expression rankings showed the highest correlation coefficients with the biologically correct reference cell types (examples in Fig. 1A-C). We found Spearman correlations to be superior to Euclidean distances as a method of comparison. Analyses based on global gene expression and on our curated set of 315 kidney cell type marker genes yielded similar results. Notably, correlation coefficients were higher and showed less variation between reference cell types when using the global gene set. Matchings across species, i.e. using murine test and human reference data or vice versa, still yielded mostly correct results. However, correlation coefficients were generally lower (e.g. rho = 0.6 vs. 0.9 within species) and varied more across reference datasets when comparisons were performed across species. Using published RNA-seq data for different cell lines of the proximal tubule (e.g. human HK-2 and opossum OKH cells), we observed low similarities with proximal tubule cells (Fig. 1D), whereas two tested cell lines of the collecting duct (mIMCD-3 and mpkCCD cells) plausibly showed the highest similarity to medullary cell types such as collecting duct and loop of Henle cells. The most similar reference cell types did not change in mIMCD-3 cells when kept in 2-dimensional vs. 3-dimensional culture conditions, or in mpkCCD cells when the osmolality of the culture medium was changed from 300 to 600 mosmol/kg. Likewise, a Pkd1 knockout in mIMCD-3 cells did not change the most similar reference cell type in our analyses (Fig. 1E and F).

Conclusion
Our CellMatchR approach uses publicly available kidney single-cell and bulk RNA-seq datasets and combines these with simple, computationally fast yet effective statistical methods to determine similarities of kidney cell lines and tissue samples to reference cell types of interest. It can easily be implemented by trained users or integrated into online resources for use with a visual interface.
It relies on RNA-seq data from the cell lines of interest and hence presents a feasible complementary method to check the general similarity of cell culture models to kidney cell types and assess their stability across experimental conditions.
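
The comparison step described in the Method section can be illustrated with a minimal Python sketch. It is not the CellMatchR implementation: the function names, the toy counts, and the choice of log2(CPM + 1) for the log transform are assumptions made for illustration, since the abstract does not specify the log base or pseudocount.

```python
import math

def cpm(counts):
    """Scale raw gene counts to counts-per-million (CPM)."""
    total = sum(counts.values())
    return {gene: c * 1e6 / total for gene, c in counts.items()}

def average_ranks(values):
    """Rank values (1-based); tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman's rho, computed as the Pearson correlation of the ranks.
    Undefined (division by zero) if either vector is constant."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)

def euclidean_log_cpm(x, y):
    """Euclidean distance between log-transformed CPM vectors.
    Assumption: log2(CPM + 1); the source does not state base or pseudocount."""
    return math.sqrt(sum((math.log2(a + 1) - math.log2(b + 1)) ** 2
                         for a, b in zip(x, y)))

def match(test_counts, references, genes=None):
    """Compare one test sample with every reference cell type.
    genes: optional marker gene subset; by default all genes measured in
    both the test and the reference dataset are used."""
    test_cpm = cpm(test_counts)
    results = {}
    for cell_type, ref_counts in references.items():
        ref_cpm = cpm(ref_counts)
        shared = sorted(set(test_cpm) & set(ref_cpm))
        if genes is not None:
            shared = [g for g in shared if g in genes]
        x = [test_cpm[g] for g in shared]
        y = [ref_cpm[g] for g in shared]
        results[cell_type] = (spearman(x, y), euclidean_log_cpm(x, y))
    return results

# Toy counts, invented purely for illustration (not data from the study).
references = {
    "proximal_tubule": {"Lrp2": 900, "Slc34a1": 700, "Umod": 20, "Aqp2": 5, "Actb": 500},
    "collecting_duct": {"Lrp2": 10, "Slc34a1": 5, "Umod": 30, "Aqp2": 800, "Actb": 500},
}
test_sample = {"Lrp2": 15, "Slc34a1": 8, "Umod": 40, "Aqp2": 600, "Actb": 450}

results = match(test_sample, references)
# The best match is the reference with the highest Spearman coefficient.
best = max(results, key=lambda cell_type: results[cell_type][0])
```

Because Spearman's rho depends only on gene ranks, it is unaffected by monotone rescaling of expression values. That is one plausible reason for its robustness across datasets and species, although the abstract itself does not state why it outperformed Euclidean distance.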