Abstract
Background
Because of the large volume of data and the intrinsic variation of data intensity observed in microarray experiments, different statistical methods have been used to systematically extract biological information and to quantify the associated uncertainty. The simplest method to identify differentially expressed genes is to evaluate the ratio of average intensities in two different conditions and consider all genes that differ by more than an arbitrary cut-off value to be differentially expressed. This filtering approach is not a statistical test, and there is no associated value that can indicate the level of confidence in the designation of genes as differentially expressed or not differentially expressed. At the same time, the fold change by itself provides valuable information, and it is important to find unambiguous ways of using this information in the treatment of expression data.
Results
A new method for finding differentially expressed genes, called the distributional fold change (DFC) test, is introduced. The method is based on an analysis of the intensity distribution of all microarray probe sets mapped to a three-dimensional feature space composed of the average expression level, the average difference of gene expression and the total variance. The proposed method allows one to rank each feature based on the signal-to-noise ratio and to ascertain for each feature the confidence level and power for being differentially expressed. The performance of the new method was evaluated using the total and partial area under receiver operating characteristic curves (AUC). It was tested on 11 data sets from the Gene Expression Omnibus database with independently verified differentially expressed genes and compared with the t-test and the shrinkage t-test. Overall, the DFC test performed best: on average it had higher sensitivity and partial AUC, and its improvement was most prominent in the low range of differentially expressed features, typical for formalin-fixed paraffin-embedded sample sets.
Conclusions
The distributional fold change test is an effective method for finding and ranking differentially expressed probe sets on microarrays. The application of this test is advantageous for data sets using formalin-fixed paraffin-embedded samples or other systems where degradation effects diminish the applicability of correlation-adjusted methods to the whole feature set.
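To make the quantities named in the abstract concrete, the following minimal sketch maps a single probe set to the three coordinates (average expression level, average difference, total variance) and applies the naive fold-change cut-off that the Background describes. The abstract does not give the exact definitions used by the DFC test, so the formulas here are assumptions chosen for illustration: intensities are taken to be log2-transformed, the total variance is taken as the sum of the two per-condition sample variances, and the names `probe_set_features` and `passes_fold_change_cutoff` are hypothetical.

```python
from statistics import mean, variance


def probe_set_features(cond_a, cond_b):
    """Map one probe set to (average level, average difference, total variance).

    cond_a, cond_b: log2 intensities of the probe set in the two conditions.
    These are illustrative textbook definitions, not the DFC paper's formulas.
    """
    mean_a, mean_b = mean(cond_a), mean(cond_b)
    avg_level = (mean_a + mean_b) / 2                # average expression level
    avg_diff = mean_b - mean_a                       # log2 fold change
    total_var = variance(cond_a) + variance(cond_b)  # assumed: sum of sample variances
    return avg_level, avg_diff, total_var


def passes_fold_change_cutoff(avg_diff, cutoff=1.0):
    """Naive filter: keep probe sets whose |log2 fold change| reaches the cut-off.

    A cutoff of 1.0 on the log2 scale corresponds to a two-fold change.
    This filter carries no confidence level, which is the limitation
    the abstract points out.
    """
    return abs(avg_diff) >= cutoff


level, diff, var = probe_set_features([8.0, 8.2, 7.8], [10.0, 10.2, 9.8])
print(level, diff, var, passes_fold_change_cutoff(diff))
```

The filter looks only at the average difference; the other two coordinates are what a distribution-based test could use to attach a signal-to-noise ratio and a confidence level to the same fold change.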
Highlights
Because of the large volume of data and the intrinsic variation of data intensity observed in microarray experiments, different statistical methods have been used to systematically extract biological information and to quantify the associated uncertainty
This paper presents the description of a method, called the distributional fold change (DFC) test, which is based on the analysis of the distribution of intensities of all features on a microarray mapped to a three dimensional feature space composed of the average difference of gene expression, total variance and average expression level
Data sets: We evaluated the performance of the DFC test using 11 publicly available Homo sapiens microarray data sets, listed in Table 1, each of which has had a portion of its discovered differentially expressed genes (DEGs) experimentally validated by real-time polymerase chain reaction (RT-PCR)
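The evaluation described above ranks features by a test score and compares the ranking with an independently validated set of DEGs using the total and partial AUC. The sketch below shows one common way to compute these quantities with trapezoidal integration. It is not the authors' evaluation code: the function names, the toy scores and the choice of `max_fpr` are illustrative assumptions, and the labels are assumed to contain at least one validated and one non-validated feature.

```python
def roc_points(scores, labels):
    """Return ROC points (FPR, TPR), starting at (0, 0).

    scores: higher means stronger evidence of differential expression.
    labels: 1 for a validated DEG, 0 otherwise (both classes must be present).
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        j = i
        # Consume tied scores together so a tie yields a single ROC segment.
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            if ranked[j][1]:
                tp += 1
            else:
                fp += 1
            j += 1
        points.append((fp / n_neg, tp / n_pos))
        i = j
    return points


def partial_auc(points, max_fpr=1.0):
    """Trapezoidal area under the ROC curve for FPR in [0, max_fpr]."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 >= max_fpr:
            break
        if x1 > max_fpr:
            # Clip the last segment at max_fpr by linear interpolation.
            y1 = y0 + (y1 - y0) * (max_fpr - x0) / (x1 - x0)
            x1 = max_fpr
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Toy ranking: six features, three of them validated DEGs.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
labels = [1, 1, 0, 1, 0, 0]
points = roc_points(scores, labels)
print(partial_auc(points))               # total AUC
print(partial_auc(points, max_fpr=0.5))  # partial AUC over FPR in [0, 0.5]
```

Restricting the integral to a small false positive rate focuses the comparison on the top of the ranking, which is the part that matters when only a few candidate genes can be followed up experimentally.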
Summary
Because of the large volume of data and the intrinsic variation of data intensity observed in microarray experiments, different statistical methods have been used to systematically extract biological information and to quantify the associated uncertainty. By nature of the fixation method, formalin-fixed paraffin-embedded (FFPE) samples are partially degraded and contain low amounts of total RNA (see [10] and references therein for more details), leading to increased expression variability [10,11]. This RNA degradation depends on a number of factors, including fixation protocol, storage time and storage conditions, and the resulting variability introduces a number of challenges for gene expression studies [10,11]. The development of a method dedicated to the analysis of RNA differential expression from FFPE samples is necessary to support the many studies attempting to make discoveries from the wealth of FFPE archival material available. The absence of such a method is especially surprising in view of the enormous improvement in recent years of the methods and protocols for the extraction of RNA from FFPE samples [16].