Abstract
Most computer clustering programs available today use global de-noising algorithms to filter raw gene expression pattern data. However, many gene expression time series contain interval-dependent noise, in which the second-moment statistics of the noise are non-stationary. To address this issue, we developed a new wavelet-based algorithm (Wave-SOM) that uses a localized filtering method (wavelets) to remove noise from the data while preserving local time events in the gene expression patterns. Using various wavelet transformations, raw data are first de-noised by decomposing the time series into low- and high-frequency wavelet coefficients. We employed a discrete Hilbert transform thresholding technique to compare the size of the signal component with the noise level at each wavelet transform level: a complex-valued analytic vector is created from the coefficients, and an amplitude vector is defined from it. Following thresholding, the coefficients are fed as an input vector into a two-dimensional Self-Organizing Map (SOM) clustering algorithm. Transformed data are then clustered by minimizing the Euclidean (L2) distance between their corresponding fluctuation patterns. A multi-resolution analysis by Wave-SOM of expression data from the yeast Saccharomyces cerevisiae exposed to oxidative stress and glucose-limited growth identified 29 genes with correlated expression patterns that were mapped into 5 different nodes. This ordered clustering of yeast genes by Wave-SOM illustrates that the same set of genes (encoding ribosomal proteins) can be regulated by two different environmental stresses, oxidative stress and starvation. When clusterings of yeast cell-cycle gene expression patterns, used as test data sets, were scored with the adjusted Rand index, our algorithm outperformed the Cluster 3.0, MCLUST, CurveSOM, SSClust and GENECLUSTER clustering algorithms.
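The decomposition and thresholding steps described above can be illustrated with a minimal sketch. It is not the Wave-SOM implementation: the abstract does not name the wavelet family or the threshold rule, so the Haar wavelet, the median-based cutoff, and the names `haar_step`, `analytic_amplitude`, `wavelet_features`, `l2_distance`, `levels` and `factor` are all assumptions made for illustration. The SOM training step is omitted; only the Euclidean distance it would minimize is shown.

```python
import cmath
import math


def haar_step(signal):
    """One level of the orthonormal Haar transform: (approximation, detail)."""
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def analytic_amplitude(x):
    """Amplitude of the discrete analytic signal x + i*H[x], via a naive DFT."""
    n = len(x)
    spectrum = [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    for k in range(n):
        if k == 0 or (n % 2 == 0 and k == n // 2):
            weight = 1.0  # keep the DC and Nyquist terms unchanged
        elif k < (n + 1) // 2:
            weight = 2.0  # double the positive frequencies
        else:
            weight = 0.0  # discard the negative frequencies
        spectrum[k] *= weight
    analytic = [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]
    return [abs(z) for z in analytic]


def wavelet_features(signal, levels=2, factor=1.0):
    """Thresholded Haar coefficients, ordered coarse to fine.

    Assumption: a detail coefficient is kept when its analytic amplitude is at
    least `factor` times the (upper) median amplitude at that level. The length
    of `signal` must be divisible by 2**levels.
    """
    approx = list(signal)
    details = []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        amplitude = analytic_amplitude(detail)
        cutoff = factor * sorted(amplitude)[len(amplitude) // 2]
        details.append(
            [d if a >= cutoff else 0.0 for d, a in zip(detail, amplitude)]
        )
    features = list(approx)
    for detail in reversed(details):
        features.extend(detail)
    return features


def l2_distance(u, v):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))
```

With `factor=0.0` no coefficient is removed, and because the Haar transform used here is orthonormal, the distance between two feature vectors equals the distance between the original time series. Raising `factor` suppresses detail coefficients whose amplitude is small relative to the others at the same level, which is the localized filtering idea in its simplest form.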