Abstract

RNA-Seq is currently the most widely used method for measuring transcriptional activity in cells and tissues. It requires high-quality RNA to yield reliable and reproducible results, which is often hard to obtain because RNA degrades during sample collection and processing. Agilent's RNA Integrity Number (RIN) is a commonly adopted standard for evaluating RNA quality in NGS workflows. However, while most RNA-Seq experiments are geared towards quantifying mRNA, the RIN metric relies heavily on the amount of 18S and 28S ribosomal RNA and does not directly measure mRNA integrity. To overcome this limitation, researchers have proposed several post-alignment measures of transcript integrity, such as mRIN (mRNA Integrity Number), TIN (Transcript Integrity Number, from the RSeQC package), and DI (Degradation Index, from the DegNorm package), but so far there is no consensus as to which works best. It is also unclear to what extent RNA degradation affects downstream analysis when samples with suboptimal RNA quality are included. To answer these questions in the context of cancer research, we analyzed 198 RNA-Seq samples from 7 syngeneic mouse tumor models of different cancer types (4T1, CT26, EL4, E.G7-OVA, H22, Hepa1-6, and KLN205) with a wide range of RIN values (2.3 to 9.8). We found high concordance between RIN and medTIN (the median TIN score) in 4T1, E.G7-OVA, Hepa1-6, and KLN205 samples, but surprisingly low concordance in EL4 and H22 samples. A tentative interpretation is that, depending on the tissue or cell type, an RNA sample can have heavily degraded ribosomal RNA (hence a low RIN value) while still retaining relatively intact mRNA (resulting in a decent medTIN value).
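The per-model comparison of RIN and medTIN described above can be sketched roughly as follows. This is a minimal illustration and not the authors' pipeline: the abstract does not say which concordance statistic was used, so Spearman rank correlation is an assumption here, and the function names, model labels, and all numbers are hypothetical toy values rather than data from the study.

```python
from collections import defaultdict
from statistics import mean, median


def med_tin(tin_scores):
    """Median of the per-transcript TIN scores of one sample (medTIN)."""
    return median(tin_scores)


def _ranks(values):
    """1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation (assumed statistic): Pearson on the ranks."""
    rx, ry = _ranks(x), _ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return float("nan")  # undefined when one variable is constant
    return cov / (var_x * var_y) ** 0.5


def concordance_by_model(samples):
    """Group samples by tumor model and correlate RIN with medTIN."""
    groups = defaultdict(lambda: ([], []))
    for s in samples:
        rins, medtins = groups[s["model"]]
        rins.append(s["rin"])
        medtins.append(med_tin(s["tin"]))
    return {m: spearman(r, t) for m, (r, t) in groups.items()}


# Hypothetical toy input: one dict per sample, TIN scores per transcript.
toy_samples = [
    {"model": "model_A", "rin": 9.0, "tin": [85, 82, 79]},
    {"model": "model_A", "rin": 6.5, "tin": [72, 70, 66]},
    {"model": "model_A", "rin": 3.0, "tin": [50, 45, 40]},
    {"model": "model_B", "rin": 8.5, "tin": [80, 78, 75]},
    {"model": "model_B", "rin": 4.0, "tin": [83, 80, 77]},
    {"model": "model_B", "rin": 2.5, "tin": [79, 76, 70]},
]

result = concordance_by_model(toy_samples)
for model, rho in sorted(result.items()):
    print(f"{model}: rho = {rho:.2f}")
```

In this toy input, model_B mimics the situation the abstract describes for some models: RIN varies widely while medTIN stays nearly flat, so the rank correlation between the two is weaker than in model_A.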
Principal component analysis (PCA) revealed that both RIN and medTIN are strongly correlated with the first principal component (PC1), the strongest explanatory variable of the transcriptome, across all library types, confirming that RNA degradation can heavily bias the results of transcript-level differential expression (DE) analysis. We then evaluated TIN correction, a method proposed together with the TIN metric to correct for RNA-degradation bias, and found it to be largely ineffective on our dataset. This points to a need for better normalization and correction methods when a dataset consists of samples with widely varying RNA quality. By contrast, the impact of RNA degradation on gene-level DE analysis is much smaller. In PCA on gene-level data, the first 4 PCs, which together explain > 70% of the variation in the transcriptome, showed weak or no correlation with RIN or medTIN, and the samples clustered by tissue group as expected. Our study calls for cautious analysis and interpretation of gene expression data from degraded RNA samples, and highlights a need for more suitable RNA quality metrics and bias correction methods.

Citation Format: Yanghui Sheng, Wubin Qian, Xiaobo Chen, Henry Q. Li, Sheng Guo. Impact of RNA degradation on transcriptomic profiling in tumors samples from syngeneic mouse models [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2022; 2022 Apr 8-13. Philadelphia (PA): AACR; Cancer Res 2022;82(12_Suppl):Abstract nr 3371.