Comparison of Normalization Methods for Analysis of TempO-Seq Targeted RNA Sequencing Data.

Pierre R Bushel, Stephen S Ferguson, Richard S Paules, Sreenivasa C Ramaiahgari, Scott S Auerbach

doi:10.3389/fgene.2020.00594

Open Access

https://doi.org/10.3389/fgene.2020.00594

Abstract

Analysis of bulk RNA sequencing (RNA-Seq) data is a valuable tool to understand transcription at the genome scale. Targeted sequencing of RNA has emerged as a practical means of assessing the majority of the transcriptomic space with less reliance on large resources for consumables and bioinformatics. TempO-Seq is a templated, multiplexed RNA-Seq platform that interrogates a panel of sentinel genes representative of genome-wide transcription. Nuances of the technology require proper preprocessing of the data. Various methods have been proposed and compared for normalizing bulk RNA-Seq data, but there has been little to no investigation of how the methods perform on TempO-Seq data. We simulated count data into two groups (treated vs. untreated) at seven fold-change (FC) levels (including no change) using control samples from human HepaRG cells run on TempO-Seq and normalized the data using seven normalization methods. Upper Quartile (UQ) performed the best with regard to maintaining FC levels as detected by a limma contrast between treated vs. untreated groups. For all FC levels, specificity of the UQ normalization was greater than 0.84 and sensitivity greater than 0.90 except for the no change and +1.5 levels. Furthermore, K-means clustering of the simulated genes normalized by UQ agreed the most with the FC assignments [adjusted Rand index (ARI) = 0.67]. Despite having an assumption of the majority of genes being unchanged, the DESeq2 scaling factors normalization method performed reasonably well, as did the simple normalization procedures counts per million (CPM) and total counts (TCs). These results suggest that for two-class comparisons of TempO-Seq data, UQ, CPM, TC, or DESeq2 normalization should provide reasonably reliable results at absolute FC levels ≥2.0. These findings will help guide researchers to normalize TempO-Seq gene expression data for more reliable results.
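To make the simpler normalization methods named in the abstract concrete, the following is a minimal Python sketch of CPM, TC, and UQ scaling on a genes-by-samples count matrix. It is an illustration under stated assumptions, not the authors' implementation: the function names, the toy `counts` matrix, the rescaling by the mean library size (TC) or mean upper quartile (UQ), and the exclusion of all-zero genes before taking the 75th percentile are common conventions assumed here, and the paper's exact formulation may differ.

```python
import numpy as np


def cpm(counts):
    """Counts per million: scale each sample (column) to a library size of 1e6."""
    lib_sizes = counts.sum(axis=0)
    return counts / lib_sizes * 1e6


def total_count(counts):
    """Total-count scaling: divide by library size, rescale by the mean library size.

    Assumption: rescaling by the mean library size is one common convention.
    """
    lib_sizes = counts.sum(axis=0)
    return counts / lib_sizes * lib_sizes.mean()


def upper_quartile(counts):
    """Upper-quartile scaling: divide by each sample's 75th percentile of counts.

    Assumption: genes with zero counts in every sample are excluded before the
    percentile is taken, and results are rescaled by the mean upper quartile.
    """
    expressed = counts.sum(axis=1) > 0
    uq = np.percentile(counts[expressed], 75, axis=0)
    return counts / uq * uq.mean()


# Hypothetical toy data: 4 genes (rows) x 2 samples (columns).
# Sample 2 was "sequenced" exactly twice as deeply as sample 1.
counts = np.array(
    [[10.0, 20.0],
     [0.0, 0.0],
     [30.0, 60.0],
     [60.0, 120.0]]
)

cpm_norm = cpm(counts)
tc_norm = total_count(counts)
uq_norm = upper_quartile(counts)
```

In this toy case the two samples differ only in sequencing depth, so each method should make the two columns identical after normalization. The methods diverge on real data, where a few highly expressed or strongly changed genes can dominate the library size but have less influence on the upper quartile.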

Highlights

Over the past 25 years, interrogation of genome-wide gene expression has taken many forms. Analysis methods for cDNA and oligonucleotide microarrays (Millen and Glauser, 1978; Lockhart et al, 1996) matured over time, and preprocessing of single-channel microarray data defaulted to the de facto Robust Multichip Average (RMA) normalization (Irizarry et al, 2003a,b).
Crowdsourcing bioinformatics analysis of RNA sequencing (RNA-Seq) data through the US Food and Drug Administration MicroArray Quality Control (MAQC), SEquence Quality Control (SEQC) phase effort led to a comprehensive assessment of RNA-Seq analysis, including comparison to microarray and normalization using External RNA Control Consortium (ERCC) spike-in controls (Consortium, 2014; Risso et al, 2014; Wang et al, 2014; Xu et al, 2014).
We show that, based on sensitivity and specificity performance measures as well as the adjusted Rand index (ARI) as a measure of agreement, Upper Quartile (UQ) performed the best with respect to maintaining absolute fold change (FC) levels ≥2.0 as detected in a two-group comparison.
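The performance measures named in the highlights can be sketched in a few lines of Python. This is a hedged illustration only: the function names and the toy labels are hypothetical, and how a gene is counted as "truly changed" or "called" at a given FC level is an assumption here rather than the paper's exact procedure. The ARI is written out from its standard pair-counting definition so the example needs only the standard library.

```python
from collections import Counter
from math import comb


def sensitivity_specificity(truth, called):
    """Return (sensitivity, specificity) for paired boolean sequences.

    truth[i]  -- True if gene i was simulated as changed (assumed ground truth)
    called[i] -- True if gene i was detected as changed by the analysis
    """
    tp = sum(1 for t, c in zip(truth, called) if t and c)
    fn = sum(1 for t, c in zip(truth, called) if t and not c)
    tn = sum(1 for t, c in zip(truth, called) if not t and not c)
    fp = sum(1 for t, c in zip(truth, called) if not t and c)
    return tp / (tp + fn), tn / (tn + fp)


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two labelings of the same items."""
    n = len(labels_a)
    # Pairs of items grouped together in both labelings, in a only, in b only.
    together_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    together_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    together_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = together_a * together_b / comb(n, 2)
    max_index = (together_a + together_b) / 2
    if max_index == expected:
        # Degenerate case (e.g. every item in one cluster): treat as full agreement.
        return 1.0
    return (together_both - expected) / (max_index - expected)


# Hypothetical toy example: 8 genes, 3 simulated as changed.
truth = [True, True, True, False, False, False, False, False]
called = [True, True, False, False, False, False, False, True]
sens, spec = sensitivity_specificity(truth, called)

# Cluster labels are arbitrary, so a relabeled but identical partition
# still agrees perfectly with the FC assignments.
fc_levels = [0, 0, 0, 1, 1, 1]
clusters = [1, 1, 1, 0, 0, 0]
ari_same_partition = adjusted_rand_index(fc_levels, clusters)
ari_partial = adjusted_rand_index(fc_levels, [0, 0, 1, 1, 2, 2])
```

Because the ARI corrects for chance agreement, a value near 0 indicates clustering no better than random assignment, while 1 indicates identical partitions regardless of how the clusters are numbered.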

Summary

Introduction

Over the past 25 years, interrogation of genome-wide gene expression has taken many forms. Analysis methods for cDNA and oligonucleotide microarrays (Millen and Glauser, 1978; Lockhart et al, 1996) matured over time, and preprocessing of single-channel microarray data defaulted to the de facto Robust Multichip Average (RMA) normalization (Irizarry et al, 2003a,b). In the last few years, targeted sequencing of RNA has emerged as a practical means of capturing the totality of the transcriptomic space with less reliance on large resources for consumables and bioinformatics (Li et al, 2012). The TempO-Seq™ technology from BioSpyder™ is a templated, multiplexed RNA-Seq platform that measures the expression of sentinel genes representative of genome-wide transcription (Yeakley et al, 2017; Mav et al, 2018). Among the advantages of TempO-Seq over RNA-Seq are that it does not require RNA purification, cDNA synthesis, or capture of targeted RNA. There has not been a comprehensive comparison of normalization methods applied to TempO-Seq data.

Journal: Frontiers in Genetics	Publication Date: Jun 23, 2020
Citations: 13	License type: CC BY 4.0
