Abstract
Traditional methods for dealing with non-linearity in regression analysis often result in loss of information or compromised interpretability of the results. Spline functions are a recommended but underutilized method for modeling non-linear associations in regression models. We explain spline functions in a non-mathematical way and illustrate their application and interpretation with an empirical data example. Using data from the Amsterdam Growth and Health Longitudinal Study, we examined the non-linear relationship between the sum of four skinfolds and VO2max, which are measures of body fat and cardiorespiratory fitness, respectively. We compared traditional methods (i.e., quadratic regression and categorization) to spline methods [1- and 3-knot linear spline (LSP) models and a 3-knot restricted cubic spline (RCS) model] in terms of the interpretability of the results and their explained variance (R²). The spline models fitted the data better than the traditional methods. Increasing the number of knots in the LSP model from one to three increased the explained variance. The RCS model fitted the data best, but resulted in regression coefficients that are harder to interpret. Spline functions should be considered more often, as they are flexible and can be applied in commonly used regression analyses. RCS regression is generally recommended for prediction research (i.e., to obtain the predicted outcome for a specific exposure value), whereas LSP regression is recommended if one is interested in the effects in a population.
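To make the distinction between the two spline types more concrete, the following Python sketch builds a linear spline basis and a restricted cubic spline basis by hand and fits both with ordinary least squares. It is only an illustration under stated assumptions: the data are synthetic (not the Amsterdam Growth and Health Longitudinal Study data), the knots are placed at the 10th, 50th, and 90th percentiles of the exposure, the RCS basis uses the common truncated-power form, and all function and variable names are hypothetical.

```python
import numpy as np

def linear_spline_basis(x, knots):
    """Columns: x itself, then one hinge term max(x - knot, 0) per knot."""
    x = np.asarray(x, dtype=float)
    return np.column_stack([x] + [np.maximum(x - k, 0.0) for k in knots])

def rcs_basis(x, knots):
    """Restricted cubic spline basis (truncated-power form).

    Returns x plus len(knots) - 2 non-linear columns. The construction
    forces the fitted curve to be linear beyond the two outer knots.
    """
    x = np.asarray(x, dtype=float)
    t = np.asarray(knots, dtype=float)
    cube = lambda u: np.maximum(u, 0.0) ** 3
    width = t[-1] - t[-2]
    cols = [x]
    for tj in t[:-2]:
        term = (cube(x - tj)
                - cube(x - t[-2]) * (t[-1] - tj) / width
                + cube(x - t[-1]) * (t[-2] - tj) / width)
        # Rescale so the column is on roughly the same scale as x.
        cols.append(term / (t[-1] - t[0]) ** 2)
    return np.column_stack(cols)

def fit_ols(X, y):
    """Least-squares fit with an intercept; returns (coefficients, R^2)."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    return beta, r2

# Synthetic stand-in data (assumption): a curved, decreasing relationship.
rng = np.random.default_rng(0)
skinfolds = rng.uniform(20.0, 120.0, size=300)
vo2max = 60.0 - 12.0 * np.log(skinfolds / 20.0) + rng.normal(0.0, 3.0, size=300)

knots3 = np.percentile(skinfolds, [10, 50, 90])  # assumed knot placement
knots1 = knots3[1:2]                             # single knot at the median

beta_lsp1, r2_lsp1 = fit_ols(linear_spline_basis(skinfolds, knots1), vo2max)
beta_lsp3, r2_lsp3 = fit_ols(linear_spline_basis(skinfolds, knots3), vo2max)
beta_rcs3, r2_rcs3 = fit_ols(rcs_basis(skinfolds, knots3), vo2max)

# In an LSP model each hinge coefficient is a change in slope at its knot,
# so the slope within each segment is the running sum of the coefficients.
segment_slopes = np.cumsum(beta_lsp3[1:])

print("R^2, 1-knot LSP:", round(r2_lsp1, 3))
print("R^2, 3-knot LSP:", round(r2_lsp3, 3))
print("R^2, 3-knot RCS:", round(r2_rcs3, 3))
print("LSP slopes per segment:", np.round(segment_slopes, 3))
```

The segment slopes of the LSP model can be read directly as the change in outcome per unit of exposure within each interval, which is what makes the model easy to interpret. The coefficient on the non-linear RCS column has no such direct reading, so RCS results are usually presented as predicted values or plotted curves. Because the single knot here is also one of the three knots, the 1-knot model is nested in the 3-knot model, and its R² cannot exceed that of the 3-knot model.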