Abstract
Molecular phenotyping technologies (e.g., transcriptomics, proteomics, and metabolomics) offer the possibility to simultaneously obtain multivariate time series (MTS) data from different levels of information processing and metabolic conversions in biological systems. As a result, MTS data capture the dynamics of biochemical processes and components whose couplings may involve different scales and exhibit temporal changes. Therefore, it is important to develop methods for determining the time segments in MTS data, which may correspond to critical biochemical events reflected in the coupling of the system’s components. Here we provide a novel network-based formalization of the MTS segmentation problem based on temporal dependencies and the covariance structure of the data. We demonstrate that the problem of partitioning MTS data into segments to maximize a distance function, operating on polynomially computable network properties, often used in analysis of biological network, can be efficiently solved. To enable biological interpretation, we also propose a breakpoint-penalty (BP-penalty) formulation for determining MTS segmentation which combines a distance function with the number/length of segments. Our empirical analyses of synthetic benchmark data as well as time-resolved transcriptomics data from the metabolic and cell cycles of Saccharomyces cerevisiae demonstrate that the proposed method accurately infers the phases in the temporal compartmentalization of biological processes. In addition, through comparison on the same data sets, we show that the results from the proposed formalization of the MTS segmentation problem match biological knowledge and provide more rigorous statistical support in comparison to the contending state-of-the-art methods.
Highlights
Time-resolved data from different cellular processes hold the promise of identifying the dynamics and relations of key system descriptors mapped into putative metabolic reactions, allosteric regulations, and entire signaling pathways
Yeast’s Metabolic and Cell Cycles Motivated by the accurate predictions from applying the framework on the synthetic data set, we investigated the multivariate time series (MTS) segmentation of transcriptomics data sets from the Saccharomyces cerevisiae metabolic cycle [36] (YMC), cell cycle [37] (YCC), and the experiment capturing the effect of oxidative stress, induced by hydrogen peroxide (HP), on the yeast’s cell cycle [38]
Analysis of MTS data can be used to identify the key biological processes involved in the adjustment of the cellular states
Summary
Time-resolved data from different cellular processes hold the promise of identifying the dynamics and relations of key system descriptors mapped into putative metabolic reactions, allosteric regulations, and entire signaling pathways. These data are usually referred to as multivariate time series (MTS) since high-throughput technologies allow for simultaneous monitoring of multiple biological entities (i.e., genes, proteins, metabolites) over time. Each segment is represented by either a single quantity, e.g., the mean/median of the time series elements in the segment or the slope of the line yielding the best fit [3]. The difference between a given segment and its representative is measured by using some distance measure d (e.g., Euclidean distance)
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.