Abstract

BackgroundRecently, the Bayesian method becomes more popular for analyzing high dimensional gene expression data as it allows us to borrow information across different genes and provides powerful estimators for evaluating gene expression levels. It is crucial to develop a simple but efficient gene selection algorithm for detecting differentially expressed (DE) genes based on the Bayesian estimators.ResultsIn this paper, by extending the two-criterion idea of Chen et al. (Chen M-H, Ibrahim JG, Chi Y-Y. A new class of mixture models for differential gene expression in DNA microarray data. J Stat Plan Inference. 2008;138:387–404), we propose two new gene selection algorithms for general Bayesian models and name these new methods as the confident difference criterion methods. One is based on the standardized differences between two mean expression values among genes; the other adds the differences between two variances to it. The proposed confident difference criterion methods first evaluate the posterior probability of a gene having different gene expressions between competitive samples and then declare a gene to be DE if the posterior probability is large. The theoretical connection between the proposed first method based on the means and the Bayes factor approach proposed by Yu et al. (Yu F, Chen M-H, Kuo L. Detecting differentially expressed genes using alibrated Bayes factors. Statistica Sinica. 2008;18:783–802) is established under the normal-normal-model with equal variances between two samples. The empirical performance of the proposed methods is examined and compared to those of several existing methods via several simulations. The results from these simulation studies show that the proposed confident difference criterion methods outperform the existing methods when comparing gene expressions across different conditions for both microarray studies and sequence-based high-throughput studies. A real dataset is used to further demonstrate the proposed methodology. In the real data application, the confident difference criterion methods successfully identified more clinically important DE genes than the other methods.ConclusionThe confident difference criterion method proposed in this paper provides a new efficient approach for both microarray studies and sequence-based high-throughput studies to identify differentially expressed genes.Electronic supplementary materialThe online version of this article (doi:10.1186/s12859-015-0664-3) contains supplementary material, which is available to authorized users.

Highlights

  • The Bayesian method becomes more popular for analyzing high dimensional gene expression data as it allows us to borrow information across different genes and provides powerful estimators for evaluating gene expression levels

  • Setting 1 (Two conditions) Fifty simulations were used in this study to investigate the performance of different versions of the confident difference criterion methods described in the Confident difference criterion section

  • In this paper, we have proposed a new differentially expressed gene selection algorithm, which controls the false discovery rate (FDR) based on predictive Bayesian estimates

Read more

Summary

Introduction

The Bayesian method becomes more popular for analyzing high dimensional gene expression data as it allows us to borrow information across different genes and provides powerful estimators for evaluating gene expression levels. The high-throughput technologies offer us rich biological information, they are highly error-prone because many genes are monitored at the same time with a relatively small sample size. Bayesian methods provide a good solution to this problem because they synthesize all the data by borrowing information across different genes and produce more efficient estimators for evaluating the gene expressions. They include linear models in LIMMA [28] where empirical Bayesian methods were used to obtain stable results even with small sample size. Other Bayesian methods for RNA-Seq studies using generation sequencing were reviewed by Kvam et al [16] and Soneson and Delorenzi [29]

Methods
Results
Conclusion
Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.