Stepwise Logistic Regression

Nathan L Pace, William M Briggs

doi:10.1213/ane.0b013e3181a7b52d

Abstract

In Response:

We thank Dr. Arunajadai for his comments about the statistical simulations in our editorial (text NLP, algorithm WMB) demonstrating the perils of stepwise logistic regression.1 His letter allows us to clarify an ambiguity in the nomenclature of the stepwise automatic variable selection algorithm. Correctly specified, the algorithm should be described as stepwise forward selection, stepwise backward elimination, or stepwise with forward selection and/or backward elimination; however, the word stepwise by itself is commonly used to refer either to any of the three variants or to the third variant alone. Arunajadai2 has correctly stated that our simulations used the stepwise backward elimination variant. Our simulations used randomly generated covariates to demonstrate how commonly stepwise modeling (backward elimination variant) creates spurious associations.

Dr. Arunajadai has also provided R software code to perform the other two variants; he reports that neither the forward selection variant nor the forward selection/backward elimination variant produced spurious associations, with no covariate significant at P < 0.05. In his code, Arunajadai estimates a mean (intercept-only) model object, i.e., “fit <- glm(y ~ 1, data = w, family = binomial),” for submission to the stepwise function. Submitting a mean intercept model to the stepwise process cannot identify any association, true or spurious. When a full (all covariates) model, i.e., “fit <- glm(y ~ ., data = w, family = binomial),” is used, all three variants give qualitatively the same result: numerous spurious associations (appendix available at www.anesthesia-analgesia.org). The inclusion of noise variables during stepwise modeling, regardless of the variant, has been demonstrated elsewhere.3–5

Dr. Arunajadai also raised the very interesting question of which information criterion should be used at each step for adding or removing a covariate; he advocates the Bayesian Information Criterion (BIC) in contrast to the Akaike Information Criterion (AIC) used in our simulation. Both the AIC and the BIC are indexes in which twice the negative maximized log likelihood of the model fit is penalized by adding either twice the number of model parameters (AIC) or the number of model parameters multiplied by the log of the sample size (BIC). Of the candidate models possible, the model with the lower AIC or lower BIC is favored. As Arunajadai noted, the BIC is more heavily penalized and will produce more parsimonious models (fewer significant covariates). However, there is a trade-off in choosing between the AIC and the BIC: the AIC yields optimal regression estimation, whereas the BIC provides consistent model identification. It is not possible to create models with the properties favored by both the AIC and the BIC.6 Using the BIC index in our simulation still produces spurious associations.

Automatic variable selection via a stepwise process is a hazardous undertaking. As J. B. Copas3 humorously noted, “If you torture the data for long enough, in the end they will confess …. What more brutal torture can there be than subset selection? The data will always confess, and the confession will usually be wrong.”

Nathan L. Pace, MD, MStat
Department of Anesthesiology
University of Utah
Salt Lake City, Utah

William M. Briggs, PhD
Department of Emergency Medicine
New York Methodist Hospital
Brooklyn, New York
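
The role of the starting model described in the letter can be illustrated with a minimal R sketch. This is not the authors' appendix code or Dr. Arunajadai's code; the sample size `n`, the number of noise covariates `p`, the seed, and every object name other than `w` and `y` are assumptions chosen for illustration. It assumes the stepwise function in question behaves like `step()` from base R's stats package, which uses the starting model as the upper limit when no `scope` is given.

```r
# Illustrative only: n, p, the seed, and the object names are assumptions,
# not values taken from the letter or its appendix.
set.seed(2009)
n <- 200   # number of observations (assumed)
p <- 20    # number of pure-noise covariates (assumed)

# Covariates and outcome are generated independently of each other, so any
# covariate retained by a selection procedure is a spurious association.
w <- as.data.frame(matrix(rnorm(n * p), nrow = n, ncol = p))
w$y <- rbinom(n, size = 1, prob = 0.5)

null_fit <- glm(y ~ 1, data = w, family = binomial)  # intercept only
full_fit <- glm(y ~ ., data = w, family = binomial)  # all covariates

# With no scope, step() uses the starting model as the upper limit, so an
# intercept-only start has no covariates to add or drop.
no_scope <- step(null_fit, trace = 0)

# Forward selection needs the candidate covariates supplied through `scope`.
forward <- step(null_fit, scope = formula(full_fit),
                direction = "forward", trace = 0)

# Backward elimination and the combined variant, started from the full model.
backward <- step(full_fit, direction = "backward", trace = 0)
both     <- step(full_fit, direction = "both", trace = 0)

# k = 2 (the default) is the AIC penalty; k = log(n) is the BIC penalty.
backward_bic <- step(full_fit, direction = "backward", k = log(n), trace = 0)

# Names of the noise covariates each procedure retained.
kept <- lapply(
  list(no_scope = no_scope, forward = forward, backward = backward,
       both = both, backward_bic = backward_bic),
  function(fit) attr(terms(fit), "term.labels")
)
print(kept)
```

Any name listed for `forward`, `backward`, `both`, or `backward_bic` is a noise covariate that survived selection. How many survive depends on the seed and on the assumed `n` and `p`, so the sketch should be rerun over many seeds before drawing conclusions about frequency.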

Journal: Anesthesia & Analgesia
Publication Date: Jul 1, 2009
Citations: 2
