Abstract

BackgroundUsing next-generation sequencing technology to measure gene expression, an empirically intriguing question concerns the identification of differentially expressed genes across treatment groups. Existing methods aim to identify genes whose mean expressions differ among treatment groups by assuming equal dispersion across all groups. For syndromes, however, various combinations of gene expression alterations can result in the same disease, leading to greater heteroscedasticity in the biological replicates in the disease group compared to the normal group. Traditional methods that only consider changes in the mean will fail to fully analyze gene expression in such a scenario. In addition, sequencing technology is relatively expensive; most labs can only afford a few replicates per treatment group, which poses further challenges to reliably estimating the mean and dispersion under each treatment condition.ResultsWe designed an empirical Bayes method and a pooled permutation test to simultaneously consider the change in mean and dispersion across treatment groups. We further computed confidence intervals based on Bayes estimates to identify differentially expressed genes that are unique to each disease sample as well as those that are common across all disease samples. We illustrated our method by applying it to gene expression data from a large offspring syndrome experiment, which motivated this study. We compared our method to competing approaches through simulation studies that mimicked the real datasets to demonstrate the effectiveness of our proposed method.ConclusionsWe will show that, compared to popular methods that only aim to find the difference in the mean, our method can capture greater variation in the disease group to effectively identify differentially expressed genes for syndromes.

Highlights

  • Using next-generation sequencing technology to measure gene expression, an empirically intriguing question concerns the identification of differentially expressed genes across treatment groups

  • The objective of our analysis is to develop a statistical method for syndromes that is more powerful than existing methods of Differentially expressed (DE) gene detection between normal and disease groups

  • Results and discussion we demonstrate the performance of our proposed method using real biological experiment data and simulated datasets

Read more

Summary

Introduction

Using next-generation sequencing technology to measure gene expression, an empirically intriguing question concerns the identification of differentially expressed genes across treatment groups. Existing methods aim to identify genes whose mean expressions differ among treatment groups by assuming equal dispersion across all groups. Sequencing technology is relatively expensive; most labs can only afford a few replicates per treatment group, which poses further challenges to reliably estimating the mean and dispersion under each treatment condition. Expressed (DE) genes often refer to genes whose mean expressions differ across treatment groups, such as normal versus disease groups. High-throughput technologies, such as microarray and next-genration sequencing (NGS) technology measure gene expression levels simultaneously for tens of thousands of genes in the whole genome. NGS is relatively expensive; most biology labs can only afford three or four replicates per treatment group. It is extremely challenging to accurately estimate expression means and error variances that are crucial to DE gene identification

Methods
Results
Conclusion
Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call