Abstract
The traditional approach to microarray data has been to apply transformations that make them approximately normal, at the cost of losing the original scale. The alternative taken here is to search for models that fit the data, which typically include negative values, while preserving their scale; one advantage of this strategy is that it allows a direct interpretation of the results. A new family of distributions, named gpower-normal and indexed by a parameter, is introduced, and it is proven that these variables become normal or truncated normal when a suitable gpower transformation is applied. Expressions are given for moments and quantiles in terms of the truncated normal density. This new family can be used to model asymmetric data that include non-positive values, as required for microarray analysis. Moreover, it is proven that the gpower-normal family is a special case of pseudo-dispersion models, inheriting the good properties of these models, such as asymptotic normality for small variances. A combined maximum likelihood method is proposed to estimate the model parameters, and it is applied to microarray and contamination data. R code is available from the authors upon request.
Highlights
While analysing microarray intensity measurements, it is usual to find asymmetric distributions with some negative values, and the purpose of this article is to model data with these characteristics.
The traditional approach with microarray data has been to apply transformations that approximately normalize them, with the drawback of losing the original scale.
The initial transformation applied was log2; it allows working with log-ratios, which have a simple and intuitive meaning for biologists.
One advantage of this strategy is that it facilitates a direct interpretation of the results. In this direction, [11] showed that data that become normal after a glog transformation belong to what they called the glog-normal distribution family. We extend their results by characterizing those distributions that become normal after a gpower transformation.
We introduce the gpower-normal family; this family of distributions should be fitted to gene intensities that have previously been calibrated with an affine transformation, according to the Bengtsson and Hössjer proposal [10].
Summary
The initial transformation applied was log2; it allows working with log-ratios, which have a simple and intuitive meaning for biologists (see, for example, [1,2]). This transformation usually works well for high values but not for zero or low ones, and it cannot be applied to negative values. To avoid these drawbacks, [3,4] suggested the generalized logarithm transformation (glog), which allows negative values. This transformation is obtained from a multiplicative–additive linear error model for the data through a Taylor approximation.
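The contrast between log2 and glog on non-positive intensities can be illustrated with a minimal sketch. It assumes one commonly used form of the generalized logarithm, log2((x + sqrt(x^2 + c^2)) / 2); the exact parametrization in [3,4] may differ, and the function names, the constant `c`, and the sample intensities below are illustrative assumptions, not values from the paper.

```python
import math


def log2_or_nan(x):
    """Plain log2; it is undefined for x <= 0, so NaN is returned there."""
    return math.log2(x) if x > 0 else float("nan")


def glog2(x, c=1.0):
    """Assumed glog form: log2((x + sqrt(x^2 + c^2)) / 2), with c > 0.

    For x < 0 the algebraically equivalent expression
    c^2 / (sqrt(x^2 + c^2) - x) is used to avoid cancellation.
    """
    root = math.sqrt(x * x + c * c)
    inner = x + root if x >= 0 else (c * c) / (root - x)
    return math.log2(inner / 2.0)


# Made-up intensities, including negative and zero values.
intensities = [-5.0, 0.0, 0.5, 10.0, 1000.0]

for x in intensities:
    print(f"x={x:8.1f}  log2={log2_or_nan(x):8.3f}  glog2={glog2(x):8.3f}")
```

Under this assumed form, glog2 is finite and increasing over the whole real line and approaches log2 for large intensities, which is the behaviour the paragraph above attributes to the glog.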