Abstract
Introduction
Integrative analysis of multiple data sets can provide complementary information about the studied biological system. However, data fusion of multiple biological data sets can be complicated, as the data sets might contain different sources of variation due to underlying experimental factors. Therefore, taking the experimental design of the data sets into account could be important in data fusion.
Objectives
In the present work, we aim to incorporate the experimental design information in the integrative analysis of multiple designed data sets.
Methods
Here we describe penalized exponential ANOVA simultaneous component analysis (PE-ASCA), a new method for integrative analysis of data sets from multiple compartments or analytical platforms with the same underlying experimental design.
Results
Using two simulated cases, the results of simultaneous component analysis (SCA), penalized exponential simultaneous component analysis (P-ESCA) and ANOVA-simultaneous component analysis (ASCA) are compared with those of the proposed method. Furthermore, real metabolomics data obtained from NMR analysis of two different brain tissues (hypothalamus and midbrain) from the same piglets with an underlying experimental design are investigated by PE-ASCA.
Conclusions
This method provides an improved understanding of the common and distinct variation in response to different experimental factors.
Highlights
Integrative analysis of multiple data sets can provide complementary information about the studied biological system
We investigate the performance of PE-ASCA in identifying the common and distinct components contributing to each design factor
The simulated data sets are generated according to Eq (11)
Summary
Integrative analysis of multiple data sets can provide complementary information about the studied biological system. Even though data fusion is crucial for understanding the relationship between different data sets, current data fusion methods (Alter et al 2003; Gaynanova and Li 2017; Lock et al 2013; Schouteden et al 2013; Song et al 2019; Trygg and Wold 2003) do not take the experimental complexity of the data sets into account. This means that the different sources of variation arising from the experimental design are not described in the data fusion models, which could hamper interpretation of the common and distinct sources of variation. ASCA can decompose the data into low-rank submatrices that can be assigned to different experimental factors and their interactions.
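To make the ASCA decomposition mentioned above concrete, the following is a minimal sketch of plain ASCA for a single data set, not of PE-ASCA itself. It assumes a balanced design with two crossed factors; the function names `asca_effects` and `component_scores_loadings`, the toy design, and the random data are all illustrative assumptions and do not come from the paper.

```python
import numpy as np

def asca_effects(X, a, b):
    """Split X (samples x variables) into effect matrices for two crossed
    factors a and b. Assumes a balanced design (equal samples per cell)."""
    X = np.asarray(X, dtype=float)
    a, b = np.asarray(a), np.asarray(b)
    Xc = X - X.mean(axis=0, keepdims=True)      # remove the overall mean

    Xa = np.zeros_like(Xc)                      # main effect of factor a
    for lev in np.unique(a):
        Xa[a == lev] = Xc[a == lev].mean(axis=0)

    Xb = np.zeros_like(Xc)                      # main effect of factor b
    for lev in np.unique(b):
        Xb[b == lev] = Xc[b == lev].mean(axis=0)

    Xab = np.zeros_like(Xc)                     # interaction: cell means
    for la in np.unique(a):                     # minus both main effects
        for lb in np.unique(b):
            cell = (a == la) & (b == lb)
            Xab[cell] = Xc[cell].mean(axis=0)
    Xab -= Xa + Xb

    E = Xc - Xa - Xb - Xab                      # residual variation
    return {"A": Xa, "B": Xb, "AB": Xab, "E": E}

def component_scores_loadings(M, n_components=1):
    """Low-rank component model of one effect matrix via the SVD."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U[:, :n_components] * s[:n_components], Vt[:n_components].T

# Toy balanced design: 2 levels of a, 3 levels of b, 2 replicates per cell.
rng = np.random.default_rng(0)
a = np.repeat([0, 1], 6)
b = np.tile(np.repeat([0, 1, 2], 2), 2)
X = rng.normal(size=(12, 5))

effects = asca_effects(X, a, b)
scores_a, loadings_a = component_scores_loadings(effects["A"])
```

Each effect matrix can then be modelled separately by a component analysis, which is what allows the variation to be interpreted per experimental factor. In a balanced design the effect matrices are mutually orthogonal, so they add back up to the centered data exactly.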