Abstract
When detecting differentially expressed (DE) genes between groups of samples measured with a high-throughput expression system, we often use classical statistical tests based on the simple assumption that the expression of a DE gene in one group is, on average, higher or lower than in the other group. Under this assumption, the theory of the optimal discovery procedure (ODP) (Storey, 2005) provides an optimal thresholding function for DE gene detection. However, the expression patterns of DE genes over samples may have a structure that is not exactly consistent with the group labels assigned to the samples. Appropriate treatment of such structure can increase detection ability. Namely, genes whose expression patterns resemble those of other biologically meaningful genes can be regarded as statistically more significant than genes whose expression patterns are independent of other genes, even if the differences in mean expression levels are comparable. In this study, we propose a new statistical thresholding function based on a latent variable model that incorporates expression patterns together with the ODP theory. The latent variable model assumes hidden common signals behind the expression patterns over samples, and the ODP theory is extended to involve the latent variables. When applied to several gene expression data matrices that include cluster structures or 'cancer outlier' structures, the newly proposed thresholding functions showed markedly better DE gene detection performance than the original ODP thresholding function. We also demonstrate how the proposed methods behave through analyses of real breast cancer and lymphoma datasets.