Abstract
To the Editor: We appreciate the opportunity to respond to several issues raised in the editorial by Dr. Pace¹ regarding the selection of predictor variables and the validity of statistical models built with the stepwise regression methods used in our article in this issue of the journal.² We agree with Dr. Pace that our statistical models are hypothesis generating and are not a mechanistic or pathophysiologic explanation of the effects of the covariates on outcome. Logistic regression analysis can establish only associations between variables, not cause and effect.

In his simulations using 1000 or 100,000 rows of data, 50 random covariates, and 1 random outcome measure, he found approximately 2–3 independent predictors per simulation that were spuriously statistically significant at the P < 0.05 level. With 50 random covariates and a statistical test at the 0.05 level, we would expect to see about 50 × 0.05 = 2.5 predictors that are spuriously statistically significant. His simulations have confirmed the expected false-positive rate of 0.05.

In the development of the Veterans Affairs National Surgical Quality Improvement Program (NSQIP), we did extensive testing of our statistical models using split-sample and bootstrap methods, as recommended by Dr. Pace and others. Some of the results of this testing are published in our primary manuscripts on the predictors of postoperative mortality³ and morbidity.⁴ Using the split-sample approach, we found that the c-index for predicting mortality ranged between 0.773 and 0.923 for the different surgical subspecialties, and that the degradation of the c-index between a randomly selected learning and test dataset averaged between 0.003 and 0.109. We concluded that these c-indexes were very acceptable and did not degrade much in split-sample testing.
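The expected count of 50 × 0.05 = 2.5 spurious predictors can be checked with a short Monte Carlo sketch. This is only an illustration under a simplifying assumption: each of the 50 null covariates is treated as an independent test whose P value is uniformly distributed on [0, 1], which is what a valid test yields when there is no true association. It does not reproduce Dr. Pace's simulations or a stepwise logistic regression, and the function names and the number of simulated runs are hypothetical choices made here.

```python
import random


def count_spurious(n_covariates=50, alpha=0.05, rng=random):
    """Count null covariates whose simulated P value falls below alpha.

    Assumption: under the null hypothesis each P value is an independent
    draw from the uniform distribution on [0, 1].
    """
    return sum(rng.random() < alpha for _ in range(n_covariates))


def mean_spurious(n_simulations=20000, n_covariates=50, alpha=0.05, seed=0):
    """Average number of spuriously significant covariates per simulation."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    total = sum(
        count_spurious(n_covariates, alpha, rng) for _ in range(n_simulations)
    )
    return total / n_simulations


if __name__ == "__main__":
    expected = 50 * 0.05  # 2.5 spurious predictors expected per simulation
    simulated = mean_spurious()
    print(f"expected: {expected:.2f}, simulated mean: {simulated:.2f}")
```

Under these assumptions the simulated mean should settle close to 2.5, in line with the 2–3 spurious predictors per simulation described above.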
Using bootstrap methods, we found that many of the most important predictor variables in our models (those that entered in the first 5–10 steps of the stepwise regression) were very stable and entered into most of the models developed. We agree that there is instability in the predictor variables that enter in the later stages of a stepwise regression analysis. We have also found that the most important predictor variables of postoperative mortality and morbidity have remained stable from 1 year to the next in the NSQIP.⁵

We have chosen not to replicate these split-sample and bootstrap analyses in other articles that use NSQIP data for prediction modeling, such as our article.² Stepwise logistic regression was only a small part of the analyses in our article.² The purpose of Table 8 in that article was simply to show and contrast some of the important predictor variables for 24-h and 30-day postoperative death. We believe that the first 5–10 variables in each model have validity and stability, and we agree that some of the variables that enter after the top 5–10 may have less validity and stability. It is also important to inspect the size of the reported odds ratios; for example, the odds ratios for aortic surgery and dyspnea at rest are particularly high for predicting 24-h death, and these are probably important factors.

In summary, we agree that Dr. Pace has raised important issues regarding the use and interpretation of stepwise regression methods, which we have addressed in other articles from the NSQIP.

Michael J. Bishop, MD
William G. Henderson, PhD
Karen B. Domino, MD, MPH
[email protected]