Abstract 2472: Benchmark of lncRNA quantification in RNA-Seq of cancer samples

Hong Zheng, Mikel Hernaez, Kevin Brennan, Olivier Gevaert

doi:10.1158/1538-7445.am2018-2472

Abstract

Introduction: Long non-coding RNAs (lncRNAs) have emerged as important regulators of various biological processes. Many lncRNAs with tumor-suppressor or oncogenic functions in cancer have been discovered. While many studies have exploited public resources such as the RNA-Seq data in The Cancer Genome Atlas (TCGA) to study lncRNAs in cancer, it is crucial to choose the optimal method for accurate quantification of lncRNA expression. Multiple tools for processing RNA-Seq data have appeared in recent years; however, there is not yet an accepted gold-standard pipeline for quantifying lncRNAs. We therefore aimed to evaluate the performance of popular RNA-Seq analysis tools and to recommend best practice for RNA-Seq analysis of lncRNAs.

Methods: In this benchmarking study, we compared the performance of the pseudoalignment methods Kallisto and Salmon and the alignment-based methods HTSeq, featureCounts, and RSEM by applying them to a simulated RNA-Seq dataset of 63 samples and a pan-cancer RNA-Seq dataset of 210 samples from TCGA. GENCODE release 25 was used as the transcriptome reference. All scripts are available on GitHub (zhengh42/RNASeq_pipeline).

Results: The pseudoalignment methods Kallisto and Salmon detect more lncRNAs than the alignment-based methods and correlate highly with the simulated ground truth. In contrast, the alignment-based methods tend to underestimate lncRNA expression or even fail to capture lncRNA signal that is present in the ground truth. The underestimated genes include several cancer-relevant lncRNAs, such as TERC and ZEB2-AS1. Besides their high concordance with ground truth, the pseudoalignment methods take less CPU time per sample. They also support both gene-level and transcript-level quantification, whereas HTSeq and featureCounts are suitable for gene-level but not transcript-level analysis. Overall, 10-16% of lncRNAs can be detected in the samples, with antisense RNAs and lincRNAs the two most abundant categories. A higher proportion of antisense RNAs than lincRNAs is detected. Moreover, among the expressed lncRNAs, more antisense RNAs than lincRNAs are discordant from ground truth (Spearman's correlation less than 0.7) when measured by alignment-based methods, indicating that antisense RNAs are more susceptible to mis-quantification. In addition, lncRNAs with fewer transcripts, fewer than three exons, and lower sequence uniqueness tend to be more discordant. Finally, incomplete annotation leads to overestimated expression of both lncRNAs and protein-coding genes. Full transcriptome annotation, including both protein-coding and non-coding RNAs, greatly improves the specificity of lncRNA expression quantification.

Conclusions: In summary, considering concordance with ground truth, support for both gene-level and transcript-level analysis, and running time, we recommend the pseudoalignment methods Kallisto or Salmon in combination with the full transcriptome annotation for RNA-Seq analysis of lncRNAs.

Citation Format: Hong Zheng, Mikel Hernaez, Kevin Brennan, Olivier Gevaert. Benchmark of lncRNA quantification in RNA-Seq of cancer samples [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2018; 2018 Apr 14-18; Chicago, IL. Philadelphia (PA): AACR; Cancer Res 2018;78(13 Suppl):Abstract nr 2472.
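To illustrate the discordance criterion used in the Results (a Spearman's correlation with ground truth below 0.7), a minimal Python sketch follows. It is not taken from the authors' pipeline: the function names, the gene labels, and the toy expression values are assumptions made for illustration. Genes with constant values, for which the correlation is undefined, are simply skipped here, which may differ from how the study handled them.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho as the Pearson correlation of the ranks.

    Returns None when either input is constant (correlation undefined).
    """
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        return None
    return cov / sqrt(vx * vy)


def discordant_genes(truth, estimate, threshold=0.7):
    """List genes whose estimates correlate with truth below the threshold."""
    flagged = []
    for gene, true_values in truth.items():
        rho = spearman(true_values, estimate[gene])
        if rho is not None and rho < threshold:
            flagged.append(gene)
    return flagged


# Toy per-sample expression values (hypothetical, five samples per gene).
truth = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [1.0, 2.0, 3.0, 4.0, 5.0],
}
estimate = {
    "geneA": [0.9, 2.2, 2.8, 4.1, 5.3],  # preserves the sample ordering
    "geneB": [3.0, 1.0, 5.0, 2.0, 4.0],  # ordering largely scrambled
}

print(discordant_genes(truth, estimate))
```

In this toy input, geneA keeps the same sample ordering as the ground truth and is treated as concordant, while geneB falls below the 0.7 threshold and is flagged.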

Similar Papers

Benchmark of long non-coding RNA quantification for RNA sequencing of cancer samples.
Hong Zheng ... Kevin Brennan
GigaScience | VOL. 8
01 Dec 2019

Pan-cancer analysis of non-coding transcripts reveals the prognostic onco-lncRNA HOXA10-AS in gliomas.
Keren Isaev ... Shuai Wu
Cell Reports | VOL. 37
01 Oct 2021

An integrated analysis reveals the oncogenic function of lncRNA LINC00511 in human ovarian cancer.
Jing Wang ... Yan Ding
Cancer Medicine | VOL. 8
23 Apr 2019

Abstract 3372: Comprehensive analysis of estrogen-regulated testis specific lncRNAs in breast cancer
Enrique I Ramos ... Barbara Yang
Cancer Research | VOL. 82
15 Jun 2022
