Abstract
Biomedical research typically involves longitudinal study designs where samples from individuals are measured repeatedly over time and the goal is to identify risk factors (covariates) that are associated with an outcome value. General linear mixed effect models are the standard workhorse for statistical analysis of longitudinal data. However, analysis of longitudinal data can be complicated for reasons such as difficulties in modelling correlated outcome values, functional (time-varying) covariates, nonlinear and non-stationary effects, and model inference. We present LonGP, an additive Gaussian process regression model that is specifically designed for statistical analysis of longitudinal data, which solves these commonly faced challenges. LonGP can model time-varying random effects and non-stationary signals, incorporate multiple kernel learning, and provide interpretable results for the effects of individual covariates and their interactions. We demonstrate LonGP’s performance and accuracy by analysing various simulated and real longitudinal -omics datasets.
Highlights
Biomedical research typically involves longitudinal study designs where samples from individuals are measured repeatedly over time and the goal is to identify risk factors that are associated with an outcome value
We develop fully Bayesian predictive inference for LonGP and use it to carry out model selection, i.e. to identify covariates that are associated with a given study outcome value
Gaussian processes (GP) are a flexible class of models that have become popular in machine learning and statistics
Summary
Biomedical research typically involves longitudinal study designs where samples from individuals are measured repeatedly over time and the goal is to identify risk factors (covariates) that are associated with an outcome value. Longitudinal studies are effective in identifying various risk factors that are associated with an outcome, such as disease initiation, disease onset or any disease-associated molecular biomarker. Although numerous advanced extensions of standard statistical techniques such as linear mixed effect models have been proposed, longitudinal data analysis is still complicated for several reasons, such as difficulties in choosing covariance structures to model correlated outcomes, handling irregular sampling times and missing values, accounting for functional (time-varying) covariates, choosing appropriate nonlinear effects, modelling non-stationary (ns) signals, and accurate model inference.
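To make the idea of an additive Gaussian process for longitudinal data concrete, the following is a minimal sketch and not LonGP's implementation. It assumes a toy data set of three individuals measured at five time points, a shared time effect modelled with a squared exponential kernel, and an individual-specific, time-varying effect modelled as the product of a squared exponential kernel and a same-individual indicator kernel. All function names, length scales, variances and the noise level are hypothetical choices for illustration; hyperparameters are fixed here, whereas the fully Bayesian inference and model selection described above are not shown.

```python
import numpy as np

def se_kernel(t1, t2, lengthscale=1.0, variance=1.0):
    """Squared exponential kernel on a 1-D input such as time."""
    d = t1[:, None] - t2[None, :]
    return variance * np.exp(-0.5 * (d / lengthscale) ** 2)

def same_id_kernel(id1, id2):
    """Categorical kernel: 1 if two samples share an individual, else 0."""
    return (id1[:, None] == id2[None, :]).astype(float)

def additive_gp_components(t, ids, y, noise_var=0.1):
    """Posterior mean of each additive component at the training inputs.

    Hyperparameters are fixed, assumed values (not learned from data).
    """
    # Shared effect of time, common to all individuals.
    k_shared = se_kernel(t, t, lengthscale=2.0, variance=1.0)
    # Individual-specific, time-varying effect: time kernel x same-id kernel.
    k_indiv = se_kernel(t, t, lengthscale=2.0, variance=0.5) * same_id_kernel(ids, ids)
    # Covariance of the noisy observations is the sum of the components plus noise.
    k_total = k_shared + k_indiv + noise_var * np.eye(len(y))
    alpha = np.linalg.solve(k_total, y)
    # Each component's posterior mean is K_j (K + noise * I)^{-1} y.
    return {"shared": k_shared @ alpha, "individual": k_indiv @ alpha}

# Toy longitudinal data: 3 individuals, 5 time points each (simulated).
rng = np.random.default_rng(0)
t = np.tile(np.linspace(0.0, 8.0, 5), 3)
ids = np.repeat(np.arange(3), 5)
offsets = np.array([-0.5, 0.0, 0.5])[ids]
y = np.sin(t) + offsets + 0.1 * rng.standard_normal(t.size)

components = additive_gp_components(t, ids, y)
fitted = components["shared"] + components["individual"]
print("mean squared residual:", float(np.mean((y - fitted) ** 2)))
```

Because the kernel is a sum, the posterior mean splits into one term per component, which is what allows the contribution of each covariate to be inspected separately in this kind of model.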