Influence.ME: Tools for Detecting Influential Data in Mixed Effects Models

Rense Nieuwenhuis, Manfred te Grotenhuis, Ben Pelzer

doi:10.32614/rj-2012-011

Abstract

influence.ME provides tools for detecting influential data in mixed effects models. The application of these models has become common practice, but the development of diagnostic tools has lagged behind. influence.ME calculates standardized measures of influential data for the point estimates of generalized mixed effects models, such as DFBETAS, Cook's distance, as well as percentile change and a test for changing levels of significance. influence.ME calculates these measures of influence while accounting for the nesting structure of the data. The package and measures of influential data are introduced, a practical example is given, and strategies for dealing with influential data are suggested.

The application of mixed effects regression models has become common practice in the field of social sciences. As used in the social sciences, mixed effects regression models take into account that observations on individual respondents are nested within higher-level groups such as schools, classrooms, states, and countries (Snijders and Bosker, 1999), and are often referred to as multilevel regression models. Despite these models' increasing popularity, diagnostic tools to evaluate fitted models lag behind. We introduce influence.ME (Nieuwenhuis, Pelzer, and te Grotenhuis, 2012), an R-package that provides tools for detecting influential cases in mixed effects regression models estimated with lme4 (Bates and Maechler, 2010). It is commonly accepted that tests for influential data should be performed on regression models, especially when estimates are based on a relatively small number of cases. However, most existing procedures do not account for the nesting structure of the data. As a result, these existing procedures fail to detect that higher-level cases may be influential on estimates of variables measured at specifically that level.

In this paper, we outline the basic rationale on detecting influential data, describe standardized measures of influence, provide a practical example of the analysis of students in 23 schools, and discuss strategies for dealing with influential cases.

Testing for influential cases in mixed effects regression models is important, because influential data negatively influence the statistical fit and generalizability of the model. In social science applications of mixed models the testing for influential data is especially important, since these models are frequently based on large numbers of observations at the individual level while the number of higher level groups is relatively small. For instance, Van der Meer, te Grotenhuis, and Pelzer (2010) were unable to find any country-level comparative studies involving more than 54 countries. With such a relatively low number of countries, a single country can easily be overly influential on the parameter estimates of one or more of the country-level variables.

Highlights

The application of mixed effects regression models has become common practice in the field of social sciences
We introduce influence.ME (Nieuwenhuis, Pelzer, and te Grotenhuis, 2012), an R-package that provides tools for detecting influential cases in mixed effects regression models estimated with lme4 (Bates and Maechler, 2010)
We outline the basic rationale on detecting influential data, describe standardized measures of influence, provide a practical example of the analysis of students in 23 schools, and discuss strategies for dealing with influential cases

Summary

Detecting Influential Data

All cases used to estimate a regression model exert some level of influence on the regression parameters. If a single case has extremely high or low scores on the dependent variable relative to its expected value (given the other variables in the model), on one or more of the independent variables, or on both, this case may overly influence the regression parameters by 'pulling' the estimated regression line towards itself. If a case has very extreme scores on the independent variable(s) but is fitted very well by a regression model, and has a low deleted (standardized) residual, this case is not necessarily overly influencing the outcomes of the regression model. We introduce the measure of percentile change and a test for changing levels of significance of the fixed parameters. Up to this point, this discussion on influential data was limited to how single cases can overly influence the point estimates (or BETAS) of a regression model. Inferences made to the population from models in which overly influential cases are present may be incorrect.
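The idea of a case "pulling" the regression line can be illustrated with a small deletion experiment. The following is a minimal sketch in Python, not the influence.ME implementation: it uses an ordinary single-level regression rather than a mixed effects model, deletes single cases rather than higher-level groups, and runs on invented toy data. The function names `fit_line` and `slope_dfbetas` are hypothetical. It assumes the usual DFBETAS form, the change in an estimate after deleting a case, divided by the standard error of the estimate from the model fitted without that case.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b, se_b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    # Residual sum of squares, then the standard error of the slope.
    rss = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    se_b = math.sqrt(rss / (n - 2) / sxx)
    return a, b, se_b


def slope_dfbetas(xs, ys):
    """Standardized change in the slope when each case is deleted in turn."""
    _, b_full, _ = fit_line(xs, ys)
    values = []
    for i in range(len(xs)):
        xs_without = xs[:i] + xs[i + 1:]
        ys_without = ys[:i] + ys[i + 1:]
        _, b_without, se_without = fit_line(xs_without, ys_without)
        values.append((b_full - b_without) / se_without)
    return values


# Invented toy data: seven cases close to the line y = x, plus one case
# (the last) with an extreme x and a y far below its expected value.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [1.1, 1.9, 3.2, 3.8, 5.1, 6.0, 6.9, 2.0]

values = slope_dfbetas(xs, ys)
cutoff = 2 / math.sqrt(len(xs))  # a conventional rule-of-thumb threshold
most_influential = max(range(len(values)), key=lambda i: abs(values[i]))

for i, value in enumerate(values):
    flag = "  <- exceeds cutoff" if abs(value) > cutoff else ""
    print(f"case {i}: DFBETAS = {value:.2f}{flag}")
```

In this toy data the last case drags the slope down, so its value is negative and by far the largest in absolute size. In a multilevel setting the same logic would be applied by deleting a whole higher-level group (for example a school) and refitting, which is what accounting for the nesting structure amounts to.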

Detecting Influential Data in Mixed Effects Models

The Outcome Measures

Test for changes in significance

Visual Examination

Calculating measures of influence

Dealing with Influential Data


Journal: The R Journal
Publication Date: Jan 1, 2012
Citations: 312
License type: CC BY
