Abstract

Next-generation sequencing technologies have made RNA sequencing (RNA-seq) a popular choice for measuring gene expression levels. To reduce noise in gene expression measurements and to compare them across conditions or samples, normalization is an essential step that adjusts for varying sample sequencing depths and other unwanted technical effects. In this paper, we develop a novel global scaling normalization method that uses available knowledge of housekeeping genes. We formulate the problem from a hypothesis testing perspective and find an optimal scaling factor that minimizes the deviation between the empirical and the nominal type I error. Applying our approach to various simulation studies and real examples, we demonstrate that it is more accurate and robust than state-of-the-art alternatives in detecting differentially expressed genes.

Highlights

  • In recent years, next-generation sequencing methods such as ChIP-seq and RNA sequencing (RNA-seq) have become a popular choice in biological studies because of their distinct advantages in increasing the specificity and sensitivity of gene expression measurement

  • To compare gene expression and to detect differentially expressed genes between samples, normalization is a crucial step for downstream analysis

  • Assuming the housekeeping genes are known, we propose a novel normalization method called hypothesis testing based normalization (HTN) and show that it is more effective and robust than existing alternatives for normalizing RNA-seq sequencing depth between samples
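The idea of choosing a scaling factor so that the empirical type I error on housekeeping genes matches the nominal level can be illustrated with a minimal sketch. Everything specific here is an assumption and not the paper's procedure: the two-sample setup, the exact conditional binomial test (given the total count of a gene in both samples, the count in the first sample is binomial with probability 1/(1 + c) when the second sample is c times as deep), the grid search over candidate factors, and all function names.

```python
from math import exp, lgamma, log


def binom_two_sided_p(x, n, p):
    """Exact two-sided binomial p-value: total probability of all
    outcomes that are no more likely than the observed count x."""
    def pmf(k):
        log_comb = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
        return exp(log_comb + k * log(p) + (n - k) * log(1 - p))

    probs = [pmf(k) for k in range(n + 1)]
    cutoff = probs[x] * (1 + 1e-9)  # small tolerance for float ties
    return min(1.0, sum(q for q in probs if q <= cutoff))


def empirical_type1(x, y, housekeeping, c, alpha):
    """Fraction of housekeeping genes rejected at level alpha when
    sample y is assumed to be c times as deep as sample x (c > 0)."""
    p0 = 1.0 / (1.0 + c)  # null share of reads falling in sample x
    rejected = 0
    for g in housekeeping:
        n = x[g] + y[g]
        if binom_two_sided_p(x[g], n, p0) <= alpha:
            rejected += 1
    return rejected / len(housekeeping)


def pick_scaling_factor(x, y, housekeeping, candidates, alpha=0.05):
    """Hypothetical grid search: return the candidate c whose empirical
    type I error on housekeeping genes is closest to alpha."""
    return min(
        candidates,
        key=lambda c: abs(empirical_type1(x, y, housekeeping, c, alpha) - alpha),
    )


# Toy data (made up): every gene has exactly twice the reads in sample y.
x = [10, 20, 30, 40, 50]
y = [20, 40, 60, 80, 100]
housekeeping = range(5)
best = pick_scaling_factor(x, y, housekeeping, [0.5, 1.0, 2.0, 4.0])
```

With only a handful of housekeeping genes the empirical error can take just a few values, so the match to the nominal level is coarse; a realistic housekeeping set would be much larger.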


Summary

Introduction

Next-generation sequencing methods such as ChIP-seq and RNA-seq have become a popular choice in biological studies because of their distinct advantages in increasing the specificity and sensitivity of gene expression measurement. A conventional approach in RNA-seq analysis is to standardize the data between samples by scaling their total read counts to a common value. In the proposed method, the estimated normalization scaling factor is expected to be stable across different confidence levels in the hypothesis testing. The method normalizes the samples without trimming the data and so avoids the problem of TMM-like (trimmed mean of M-values) methods.
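The conventional total-count scaling mentioned above can be sketched in a few lines. This is an illustration only: the function name, the choice of the mean library size as the default common value, and the toy counts are assumptions, not taken from the paper.

```python
def total_count_scale(counts, target=None):
    """Scale each sample so that its total read count equals `target`.

    counts: list of samples, each a list of per-gene read counts.
    target: common library size; defaults to the mean of the sample
            totals (an assumption made for this sketch).
    Returns (scaled_counts, scaling_factors).
    """
    totals = [sum(sample) for sample in counts]
    if target is None:
        target = sum(totals) / len(totals)
    factors = [target / t for t in totals]
    scaled = [[c * f for c in sample] for sample, f in zip(counts, factors)]
    return scaled, factors


# Toy data (made up): the second sample was sequenced twice as deeply.
counts = [[10, 20, 70], [20, 40, 140]]
scaled, factors = total_count_scale(counts)
```

Because the scaling factor depends on every gene's count, a few very highly expressed genes can dominate the totals, which is one reason alternatives to plain total-count scaling exist.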

Materials and Methods
Findings
Conclusion
