Abstract
Life science experiments, especially in the “omics” fields, now often generate a high volume of information from high-throughput technologies, gathered in the form of a wide and short multivariate response. These data are intrinsically correlated and are generally produced by another multivariate set of factors or continuous variables, collected in what is defined as the design matrix. Such design factors usually include a treatment, but other sources of biological or technical variability in the data are often measured as well. The ASCA framework, based on ANOVA and PCA, gives promising results. By combining dimension-reduction projection methods with classical statistical modelling, it makes it possible to identify the main sources of variability in the response and offers attractive graphical representations of the factors' effects. However, this approach has not yet been extended to more advanced designs involving random factors, which typically arise in longitudinal, hierarchical, or repeatability/reproducibility studies. This paper has its roots in the GLM version of ASCA, called ASCA+, which leads to unbiased estimators of the factors' effects for unbalanced data. It is extended here by replacing the GLM with an LMM and adapting the methodology accordingly. Taking the error structure of the data into account leads to more accurate data modelling and more generalisable results. The suggested methodology is applied to two experimental case studies that highlight its benefits: it yields a refined data analysis with interesting inferential properties while keeping the powerful visualisation outputs produced by ASCA.