Abstract

Background: In functional genomics studies, tests on mean heterogeneity have been widely employed to identify differentially expressed genes with distinct mean expression levels under different experimental conditions. Variance heterogeneity (i.e., the difference between condition-specific variances) of gene expression levels is simply neglected or calibrated for as an impediment. The mean heterogeneity in the expression level of a gene reflects one aspect of its distribution alteration, and variance heterogeneity induced by condition change may reflect another aspect. A change in condition may alter both the mean and some higher-order characteristics of the distributions of expression levels of susceptible genes.

Results: In this report, we put forth the concept of mean-variance differentially expressed (MVDE) genes, whose expression means and variances are sensitive to the change in experimental condition. We mathematically proved the null independence of existing mean heterogeneity tests and variance heterogeneity tests. Based on this independence, we proposed an integrative mean-variance test (IMVT) to combine gene-wise mean heterogeneity and variance heterogeneity induced by condition change. The IMVT outperformed its competitors under comprehensive simulations of normality and Laplace settings. For moderate samples, the IMVT controlled type I error rates well, and so did the existing mean heterogeneity tests (i.e., the Welch t test (WT) and the moderated Welch t test (MWT)) and the procedure of separate tests on mean and variance heterogeneities (SMVT), but the likelihood ratio test (LRT) severely inflated type I error rates. In the presence of variance heterogeneity, the IMVT appeared noticeably more powerful than all the valid mean heterogeneity tests. Application to the gene profiles of peripheral circulating B raised solid evidence of informative variance heterogeneity. After adjusting for background data structure, the IMVT replicated previous discoveries and identified novel experiment-wide significant MVDE genes.

Conclusions: Our results indicate a tremendous potential gain from integrating informative variance heterogeneity after adjusting for global confounders and background data structure. The proposed informative integration test better summarizes the impacts of condition change on the expression distributions of susceptible genes than do the existing competitors. Therefore, particular attention should be paid to explicitly exploiting the variance heterogeneity induced by condition change in functional genomics analysis.

Electronic supplementary material: The online version of this article (doi:10.1186/s12859-016-1393-y) contains supplementary material, which is available to authorized users.
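The abstract says the IMVT combines a gene-wise mean heterogeneity test with a variance heterogeneity test, relying on their independence under the null. This excerpt does not state the combining rule, so the sketch below assumes Fisher's method for two independent p-values purely as an illustration; the function name `fisher_combine_two` and its arguments are hypothetical and are not taken from the paper.

```python
import math


def fisher_combine_two(p_mean: float, p_var: float) -> float:
    """Combine two independent p-values with Fisher's method (assumed rule).

    Under the null, -2 * ln(p_mean * p_var) follows a chi-square distribution
    with 4 degrees of freedom, whose survival function has the closed form
    exp(-x / 2) * (1 + x / 2). Substituting x gives the expression below.
    """
    if not (0.0 < p_mean <= 1.0 and 0.0 < p_var <= 1.0):
        raise ValueError("p-values must lie in (0, 1]")
    log_product = math.log(p_mean) + math.log(p_var)
    return math.exp(log_product) * (1.0 - log_product)


# Hypothetical gene: modest evidence from both the mean and the variance test.
combined_p = fisher_combine_two(0.05, 0.05)
```

The closed form is only valid when the two p-values are independent under the null, which is why the independence result matters for controlling the type I error rate of any such combination.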

Highlights

  • In functional genomics studies, tests on mean heterogeneity have been widely employed to identify differentially expressed genes with distinct mean expression levels under different experimental conditions

  • The null independence between the mean and variance heterogeneity tests: it is commonly believed that the test statistics of mean and variance heterogeneities are dependently distributed, even if the data forming them are from an identical normal population

  • This null independence lays the foundation of type I error rate control of the integrative heterogeneity tests
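The null independence claimed in these highlights can be probed numerically. The sketch below is a simulation check, not the paper's proof. It assumes equal group sizes, uses the Welch t statistic for the mean, and uses the log ratio of sample variances as a stand-in variance heterogeneity statistic; the excerpt does not say which variance test the authors use. All function names are hypothetical.

```python
import math
import random
import statistics


def welch_t(x, y):
    """Welch t statistic for the difference in means."""
    vx, vy = statistics.variance(x), statistics.variance(y)
    return (statistics.fmean(x) - statistics.fmean(y)) / math.sqrt(
        vx / len(x) + vy / len(y)
    )


def log_variance_ratio(x, y):
    """Log of the sample variance ratio (assumed variance statistic)."""
    return math.log(statistics.variance(x) / statistics.variance(y))


def pearson(a, b):
    """Plain Pearson correlation, written out to avoid version-specific helpers."""
    ma, mb = statistics.fmean(a), statistics.fmean(b)
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    return cov / math.sqrt(
        sum((u - ma) ** 2 for u in a) * sum((v - mb) ** 2 for v in b)
    )


def simulate_null_correlation(n_reps=4000, n=10, seed=1):
    """Correlate |t| with |log F| when both groups share one normal population."""
    rng = random.Random(seed)
    abs_t, abs_log_f = [], []
    for _ in range(n_reps):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        y = [rng.gauss(0.0, 1.0) for _ in range(n)]
        abs_t.append(abs(welch_t(x, y)))
        abs_log_f.append(abs(log_variance_ratio(x, y)))
    return pearson(abs_t, abs_log_f)


null_correlation = simulate_null_correlation()
```

A correlation near zero between the magnitudes of the two statistics is consistent with independence but does not establish it; it is only a quick sanity check on simulated data.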


Summary

Introduction

Tests on mean heterogeneity have been widely employed to identify differentially expressed genes with distinct mean expression levels under different experimental conditions. Variance heterogeneity (i.e., the difference between condition-specific variances) of gene expression levels is either neglected or calibrated for as an impediment. The necessity of H02 for the ST was formally examined under the normality setting [4]. It tends to inflate the type I error rate for rejecting mean equality if the smaller sample is from the population with the larger variance. The WT [5] is an adaptation of the ST that allows for potential variance heterogeneity between two experimental conditions. This test calibrates potential variance heterogeneity as an impediment to identifying differentially expressed genes. All the aforesaid tests either ignore the variance heterogeneity or treat it as an impediment and calibrate for it when identifying differentially expressed genes.
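The inflation described above, where the smaller sample comes from the population with the larger variance, can be reproduced in a small simulation. The sketch assumes that ST denotes the classical pooled-variance Student t test, which this excerpt does not spell out. The sample sizes (5 and 20), the standard deviations, and all function names are illustrative choices, not values from the paper.

```python
import math
import random
import statistics

# Two-sided 5% critical value of Student's t with 5 + 20 - 2 = 23 degrees of freedom.
T_CRIT_DF23 = 2.069


def pooled_t(x, y):
    """Pooled-variance two-sample t statistic (assumed meaning of ST)."""
    nx, ny = len(x), len(y)
    pooled_var = (
        (nx - 1) * statistics.variance(x) + (ny - 1) * statistics.variance(y)
    ) / (nx + ny - 2)
    return (statistics.fmean(x) - statistics.fmean(y)) / math.sqrt(
        pooled_var * (1.0 / nx + 1.0 / ny)
    )


def rejection_rate(sd_small_sample, sd_large_sample, n_reps=4000, seed=7):
    """Share of null datasets (equal means) rejected at the nominal 5% level."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_reps):
        x = [rng.gauss(0.0, sd_small_sample) for _ in range(5)]
        y = [rng.gauss(0.0, sd_large_sample) for _ in range(20)]
        if abs(pooled_t(x, y)) > T_CRIT_DF23:
            rejections += 1
    return rejections / n_reps


rate_equal_variances = rejection_rate(1.0, 1.0)
rate_unequal_variances = rejection_rate(3.0, 1.0)
```

With equal variances the rejection rate should sit near the nominal 5%, while giving the smaller sample the larger variance pushes it well above that level. This is the behavior that motivates the Welch adjustment.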
