Abstract

Ultra-high-throughput sequencing of transcriptomes (RNA-Seq) is a widely used method for quantifying gene expression levels due to its low cost, high accuracy, and wide dynamic range for detection. However, the nature of RNA-Seq makes it nearly impossible to provide absolute measurements of transcript abundances. Several units or data summarization methods for transcript quantification have been proposed to account for differences in transcript lengths and sequencing depths across genes and samples. Nevertheless, further between-sample normalization is still needed for reliable detection of differentially expressed genes. In this paper, we propose a unified statistical model for joint detection of differential gene expression and between-sample normalization. Our method is independent of the unit in which gene expression levels are summarized. We also introduce an efficient algorithm for model fitting. Because our model uses an L0-penalized likelihood, it can reliably normalize the data and detect differential gene expression in some cases even when more than 50% of the genes are differentially expressed in an asymmetric manner. We compare our method with existing methods using simulated and real data sets.
