Abstract
With microarray technology, gene expression profiles are produced at a rapid rate. It remains a challenge for biologists to robustly identify periodic gene expression profiles when the time series are short and contain a high level of noise. An effective method is proposed in this paper to analyze the periodicity of gene expression time series using singular value decomposition (SVD), singular spectrum analysis (SSA) and autoregressive (AR) model-based spectral estimation. Using these procedures, noise can be filtered out and over 85% of periodic gene expression can be identified in the mouse segmentation clock data set.

AMS 2000 subject classifications: Primary 60K35, 60K35; secondary 60K35.

Keywords and phrases: Singular value decomposition (SVD), Singular spectrum analysis (SSA), Segmentation clock, Periodicity analysis, Microarray time series analysis.

Transcription profiling of mouse presomitic mesoderm with 17 samples at different time points was carried out to identify periodic genes of the segmentation clock (1). Based on this dataset, Dequeant (3) compared the pattern detection performance of several mathematical approaches: the Lomb-Scargle (L) periodogram, Phase consistency (P), Address reduction (A), Cyclohedron test (C), and Stable persistence (S). The top 300 ranked probe sets from each of these five methods were collected, and the results show that the Stable persistence (S) method performs best, identifying most of the benchmark probe sets within its top 300 probe sets. However, the data contain a high level of noise, which degrades the performance of most data analysis algorithms. Therefore, an effective method is needed to process the noisy time series data.
In this paper, an effective method is developed to identify the periodicity of microarray time series data by combining singular value decomposition (SVD), singular spectrum analysis (SSA) and autoregressive (AR) model-based spectral analysis. By considering the singular values of time series data, the noise can be reduced (4). By using AR modeling, more accurate spectral estimation results are obtained (5). In our work, about 85% of gene expression profiles in the mouse segmentation clock dataset are found to be periodic.

2. METHODS
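As a rough illustration of the pipeline named above (SVD-based SSA denoising followed by AR spectral estimation), the following Python sketch may help. It is not the authors' implementation: the function names `ssa_denoise` and `ar_spectrum`, the window length, the number of retained singular components, the AR order, and the use of the Yule-Walker estimator are all assumptions chosen for illustration, and the input is synthetic.

```python
import numpy as np

def ssa_denoise(x, window, rank):
    """Reconstruct x from its leading `rank` SSA components (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    k = n - window + 1
    # Trajectory (Hankel) matrix: entry (i, j) holds x[i + j].
    traj = np.column_stack([x[j:j + window] for j in range(k)])
    u, s, vt = np.linalg.svd(traj, full_matrices=False)
    # Keep only the largest singular values; the rest are treated as noise.
    low_rank = (u[:, :rank] * s[:rank]) @ vt[:rank]
    # Diagonal averaging maps the low-rank matrix back to a series.
    out = np.zeros(n)
    counts = np.zeros(n)
    for i in range(window):
        out[i:i + k] += low_rank[i]
        counts[i:i + k] += 1
    return out / counts

def ar_spectrum(x, order, n_freq=256):
    """AR power spectrum via the Yule-Walker equations (assumed estimator)."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    n = x.size
    # Biased autocovariance estimates for lags 0..order.
    r = np.array([x[:n - lag] @ x[lag:] for lag in range(order + 1)]) / n
    # Solve R a = r[1:], where R is the Toeplitz autocovariance matrix.
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    a = np.linalg.solve(R, r[1:])
    sigma2 = r[0] - a @ r[1:]  # innovation variance
    freqs = np.linspace(0.0, 0.5, n_freq)  # cycles per sample
    lags = np.arange(1, order + 1)
    denom = np.abs(1.0 - np.exp(-2j * np.pi * np.outer(freqs, lags)) @ a) ** 2
    return freqs, sigma2 / denom

# Synthetic 17-point series (same length as the dataset, but NOT real data).
rng = np.random.default_rng(1)
t = np.arange(17)
series = np.sin(2 * np.pi * t / 8.0) + 0.3 * rng.standard_normal(17)
smooth = ssa_denoise(series, window=8, rank=2)
freqs, psd = ar_spectrum(smooth, order=2)
print("dominant frequency (cycles/sample):", freqs[np.argmax(psd)])
```

Here a rank of 2 is used because a single sinusoid occupies two singular components of the trajectory matrix, and an AR order of 2 is the smallest that can represent one spectral peak. Real expression profiles would likely call for data-driven choices of both.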