Abstract

Background: Although much has been written on developing better procedures for variable selection, there is little research on how it is practiced in actual studies. This review surveys the variable selection methods reported in two high-ranking Chinese epidemiology journals.

Methods: Articles published in 2004, 2006, and 2008 in the Chinese Journal of Epidemiology and the Chinese Journal of Preventive Medicine were reviewed. Five categories of methods were identified whereby variables were selected using: A - bivariate analyses; B - multivariable analysis, e.g., stepwise procedures or individual significance testing of model coefficients; C - first bivariate analyses, followed by multivariable analysis; D - bivariate analyses or multivariable analysis; and E - other criteria like prior knowledge or personal judgment.

Results: Among the 287 articles that reported using variable selection methods, 6%, 26%, 30%, 21%, and 17% were in categories A through E, respectively. One hundred sixty-three studies selected variables using bivariate analyses, 80% (130/163) via multiple significance testing at the 5% alpha-level. Of the 219 multivariable analyses, 97 (44%) used stepwise procedures, 89 (41%) tested individual regression coefficients, but 33 (15%) did not mention how variables were selected. Sixty percent (58/97) of the stepwise routines also did not specify the algorithm and/or significance levels.

Conclusions: The variable selection methods reported in the two journals were limited in variety, and details were often missing. Many studies still relied on problematic techniques like stepwise procedures and/or multiple testing of bivariate associations at the 0.05 alpha-level. These deficiencies should be rectified to safeguard the scientific validity of articles published in Chinese epidemiology journals.
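One reason that multiple testing of bivariate associations at the 0.05 alpha-level is considered problematic can be illustrated with a short calculation. The sketch below is not taken from the article. It assumes that the tests are independent and that every null hypothesis is true, in which case the chance of at least one false-positive "selection" among k tests is 1 - (1 - alpha)^k. The function name `familywise_error_rate` and the values of k are illustrative choices.

```python
def familywise_error_rate(n_tests: int, alpha: float = 0.05) -> float:
    """Probability of at least one false positive among n_tests tests.

    Assumption (not from the article): the tests are independent and
    all null hypotheses are true.
    """
    return 1.0 - (1.0 - alpha) ** n_tests


if __name__ == "__main__":
    # Hypothetical numbers of candidate variables screened one at a time.
    for k in (1, 5, 10, 20):
        rate = familywise_error_rate(k)
        print(f"{k:2d} bivariate tests at alpha = 0.05: FWER = {rate:.3f}")
```

Under these assumptions, screening 10 unrelated candidate variables already gives roughly a 40% chance of retaining at least one spurious variable. Real candidate variables are usually correlated, so the exact figure in any given study will differ.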

Highlights

  • Although much has been written on developing better procedures for variable selection, there is little research on how it is practiced in actual studies

  • The variable selection methods identified in the articles were classified into five mutually exclusive categories: Category A - methods that selected variables based only on bivariate associations; Category B - methods that selected variables based only on their performance in a multivariable regression model; Category C - methods that first screened variables based on their bivariate associations, and then selected among the screened-in variables according to their performance in a multivariable regression model; Category D - methods that selected variables based on their bivariate associations or their performance in a multivariable regression model; Category E - methods that selected variables using other criteria, e.g., prior knowledge, personal judgment, or tree models

  • A greater proportion of articles in the Chinese Journal of Epidemiology (231/1199 = 19%) reported using variable selection methods than in the Chinese Journal of Preventive Medicine (56/683 = 8%), but the proportion of articles using variable selection methods did not differ substantially among 2004, 2006, and 2008
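The reported percentages can be recomputed from the counts quoted in the abstract and highlights. The following is a minimal sketch for checking that arithmetic. The counts are those stated in the text, while the dictionary layout and the helper name `pct` are illustrative.

```python
def pct(numerator: int, denominator: int) -> int:
    """Percentage rounded to the nearest whole number."""
    return round(100 * numerator / denominator)


# (numerator, denominator, percentage as reported in the text)
reported = {
    "Chinese Journal of Epidemiology articles": (231, 1199, 19),
    "Chinese Journal of Preventive Medicine articles": (56, 683, 8),
    "bivariate selection via testing at 5% level": (130, 163, 80),
    "multivariable: stepwise procedures": (97, 219, 44),
    "multivariable: individual coefficient tests": (89, 219, 41),
    "multivariable: selection method not mentioned": (33, 219, 15),
    "stepwise: algorithm/levels not specified": (58, 97, 60),
}

# True for each label whose recomputed percentage matches the reported one.
checks = {label: pct(n, d) == p for label, (n, d, p) in reported.items()}

if __name__ == "__main__":
    for label, (n, d, p) in reported.items():
        print(f"{label}: {n}/{d} = {pct(n, d)}% (reported {p}%)")
    # The journal counts add up to the 287 articles using variable selection,
    # and the three multivariable subgroups add up to 219 analyses.
    print(231 + 56, 97 + 89 + 33)
```

Each recomputed value agrees with the rounded figure given in the text, and the subgroup counts sum to the stated totals of 287 articles and 219 multivariable analyses.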


Summary

Introduction

Although much has been written on developing better procedures for variable selection, there is little research on how it is practiced in actual studies. This review surveys the variable selection methods reported in two high-ranking Chinese epidemiology journals. Selecting the appropriate variables for an analytical model is an important task in epidemiological research. This may involve finding the right combination of confounders to adjust for when estimating the association between an exposure variable and the disease outcome, obtaining a parsimonious set of prognostic variables in the construction of a screening instrument or predictive tool, or determining independent predictors for a clinical outcome in order to guide future research hypotheses. One prior study provided an assessment of general statistical analyses in five Chinese medical journals [18], but this investigation is the first to document the variable selection methods that are reported in Chinese epidemiology journals.

