RE-EM trees: a data mining approach for longitudinal and clustered data

Rebecca J Sela, Jeffrey S Simonoff

doi:10.1007/s10994-011-5258-3

Abstract

Longitudinal data refer to the situation where repeated observations are available for each sampled object. Clustered data, where observations are nested in a hierarchical structure within objects (without time necessarily being involved), represent a similar type of situation. Methodologies that take this structure into account allow for systematic differences between objects that are not related to attributes, and for autocorrelation within objects across time periods. A standard methodology in the statistics literature for this type of data is the mixed model, where these differences between objects are represented by so-called random effects that are estimated from the data (population-level relationships are termed fixed effects, together resulting in a mixed model). This paper presents a methodology that combines the structure of mixed models for longitudinal and clustered data with the flexibility of tree-based estimation methods. We apply the resulting estimation method, called the RE-EM tree, to pricing in online transactions, showing that the RE-EM tree is less sensitive to parametric assumptions and provides improved predictive power compared to linear models with random effects and regression trees without random effects. We also apply it to a smaller data set examining accident fatalities, and show that the RE-EM tree strongly outperforms a tree without random effects while performing comparably to a linear model with random effects. We also perform extensive simulation experiments to show that the estimator improves predictive performance relative to regression trees without random effects and is comparable or superior to linear models with random effects in more general situations.
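
The abstract does not spell out the estimation procedure, so the following is only a rough illustration of the general idea of pairing a tree with per-object effects: alternate between fitting a tree to the response with the current object effects removed, and re-estimating each object's effect from the tree's residuals. It is a minimal sketch under strong simplifying assumptions that are not taken from the paper: a random intercept only, a single numeric attribute, a depth-1 tree, and plain per-object mean residuals in place of a proper mixed-model fit (so there is no shrinkage and no variance estimation). The names `fit_stump` and `fit_tree_with_effects` and the toy data are hypothetical; this is not the authors' implementation.

```python
from statistics import mean


def fit_stump(x, target):
    """Fit a depth-1 regression tree on one numeric attribute.

    Tries every midpoint between consecutive distinct values of x and keeps
    the split with the smallest sum of squared errors.
    Assumes x contains at least two distinct values.
    """
    best = None
    values = sorted(set(x))
    for lo, hi in zip(values, values[1:]):
        thr = (lo + hi) / 2
        left = [t for xi, t in zip(x, target) if xi <= thr]
        right = [t for xi, t in zip(x, target) if xi > thr]
        m_left, m_right = mean(left), mean(right)
        sse = (sum((t - m_left) ** 2 for t in left)
               + sum((t - m_right) ** 2 for t in right))
        if best is None or sse < best[0]:
            best = (sse, thr, m_left, m_right)
    _, thr, m_left, m_right = best
    return lambda xi: m_left if xi <= thr else m_right


def fit_tree_with_effects(groups, x, y, n_iter=20):
    """Alternate between a tree fit and per-object intercept estimates.

    Simplification (an assumption of this sketch): each object's effect is
    the unshrunken mean of its residuals, not a mixed-model estimate.
    """
    effects = {g: 0.0 for g in set(groups)}
    tree = None
    for _ in range(n_iter):
        # Step 1: fit the tree to the response with current effects removed.
        adjusted = [yi - effects[g] for g, yi in zip(groups, y)]
        tree = fit_stump(x, adjusted)
        # Step 2: re-estimate each object's effect from the tree residuals.
        for g in effects:
            residuals = [yi - tree(xi)
                         for gi, xi, yi in zip(groups, x, y) if gi == g]
            effects[g] = mean(residuals)
    return tree, effects


# Toy data (hypothetical): three objects observed twice each.
groups = ["a", "a", "b", "b", "c", "c"]
x = [0, 1, 0, 1, 0, 1]
y = [12, 22, 8, 18, 10, 20]

tree, effects = fit_tree_with_effects(groups, x, y)
print(tree(0), tree(1), effects)
```

In this toy data every object is observed at both attribute values, so the population-level step and the per-object offsets separate cleanly. With unbalanced data, or without the shrinkage that a mixed model provides, this simplified loop can attribute part of the population-level signal to the object effects.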


Journal: Machine Learning
Publication Date: Jul 13, 2011
Citations: 186

Similar Papers

Unbiased Regression Trees for Longitudinal Data
Jeffrey S. Simonoff ... Wei Fu
SSRN Electronic Journal | VOL. -
30 Nov 2014

A regression tree method for longitudinal and clustered data with multivariate responses
Wenbo Jing ... Jeffrey S Simonoff
Journal of Statistical Computation and Simulation | VOL. 94
25 Oct 2023

Gradient boosting for linear mixed models.
Colin Griesbach ... Elisabeth Waldmann
The International Journal of Biostatistics | VOL. 17
13 Jan 2021

Equivalence of conditional and marginal regression models for clustered and longitudinal data
John Ritz ... Donna Spiegelman
Statistical Methods in Medical Research | VOL. 13
01 Aug 2004
