For testing the significance of regression coefficients, go ahead and log‐transform count data

Anthony R Ives

doi:10.1111/2041-210x.12386

Abstract

The rise in the use of statistical models for non‐Gaussian data, such as generalized linear models (GLMs) and generalized linear mixed models (GLMMs), is pushing aside the traditional approach of transforming data and applying least‐squares linear models (LMs). Nonetheless, many least‐squares statistical tests depend on the variance of the sum of residuals, and by the Central Limit Theorem this sum converges to a Gaussian distribution for large sample sizes. Therefore, least‐squares LMs will likely have good performance in assessing the statistical significance of regression coefficients.

Using simulations of count data, I compared GLM approaches for testing whether regression coefficients differ from zero with the traditional approach of applying LMs to transformed data. Simulations assumed that variation among sample populations was either (i) negative binomial or (ii) log‐normal Poisson (i.e. log‐normal variation among populations that were then sampled by a Poisson distribution). I used the simulated data to conduct tests of the hypotheses that regression coefficients differed from zero; I did not investigate statistical properties of the coefficient estimators, such as bias and precision.

For negative binomial simulations whose assumptions closely matched the GLMs, the GLMs were nonetheless prone to type I errors (false positives), especially when there was more than one predictor (independent) variable. After correcting for type I errors, however, the GLMs provided slightly better statistical power than LMs. For log‐normal‐Poisson simulations, both a GLMM and the LMs performed well, but under some simulated conditions the GLMs had high type I error rates, a deadly sin for statistical tests.

These results show that, while GLMs have slight advantages in power when they are properly specified, they can lead to badly wrong conclusions about the significance of regression coefficients if they are mis‐specified. In contrast, transforming data and applying least‐squares linear analyses provide robust statistical tests for significance over a wide range of conditions.

Thus, the traditional approach of transforming data and applying LMs is still useful.
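
The kind of comparison the abstract describes can be sketched with a small simulation. The block below is an illustration under stated assumptions, not the paper's code: the sample size (50), intercept (1.0), log‐normal standard deviation (1.0), the log(y + 1) transform, and all function names are choices made here. As the mis‐specified competitor it uses a plain Poisson GLM fitted by Newton's method, which is a simpler stand‐in and may not match the GLMs evaluated in the paper. Counts are generated with no dependence on the predictor, so every rejection at the 5% level is a type I error.

```python
import math
import random

N = 50              # sample size per simulated data set (assumed for illustration)
T_CRIT_48 = 2.0106  # two-sided 5% critical value of Student's t with N - 2 = 48 df
Z_CRIT = 1.96       # two-sided 5% critical value of the standard normal


def poisson_draw(rng, lam):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def lm_slope_t(x, z):
    """t statistic for the slope of a least-squares fit of z on x."""
    n = len(x)
    xbar, zbar = sum(x) / n, sum(z) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxz = sum((xi - xbar) * (zi - zbar) for xi, zi in zip(x, z))
    slope = sxz / sxx
    rss = sum((zi - zbar - slope * (xi - xbar)) ** 2 for xi, zi in zip(x, z))
    return slope / math.sqrt(rss / (n - 2) / sxx)


def poisson_glm_slope_z(x, y, iters=25):
    """Wald z statistic for the slope of a log-link Poisson GLM (Newton's method)."""
    b0, b1 = math.log(sum(y) / len(y)), 0.0
    for _ in range(iters):
        # Fitted means; the cap at 30 only guards math.exp against overflow.
        mu = [math.exp(min(b0 + b1 * xi, 30.0)) for xi in x]
        # Score vector (gradient of the log-likelihood).
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum((yi - mi) * xi for xi, yi, mi in zip(x, y, mu))
        # Fisher information matrix entries and determinant.
        i00 = sum(mu)
        i01 = sum(mi * xi for xi, mi in zip(x, mu))
        i11 = sum(mi * xi * xi for xi, mi in zip(x, mu))
        det = i00 * i11 - i01 * i01
        # Newton step: beta += inverse(information) * score.
        b0 += (i11 * g0 - i01 * g1) / det
        b1 += (i00 * g1 - i01 * g0) / det
    # Model-based variance of the slope, from the last information matrix computed.
    return b1 / math.sqrt(i00 / det)


def type1_rates(n_sims=400, sigma=1.0, seed=1):
    """Fraction of null data sets in which each test rejects at the 5% level."""
    rng = random.Random(seed)
    lm_hits = glm_hits = 0
    for _ in range(n_sims):
        x = [rng.gauss(0.0, 1.0) for _ in range(N)]
        # Log-normal Poisson counts; the null is true because y ignores x.
        y = [poisson_draw(rng, math.exp(1.0 + rng.gauss(0.0, sigma))) for _ in range(N)]
        z = [math.log(yi + 1) for yi in y]
        lm_hits += abs(lm_slope_t(x, z)) > T_CRIT_48
        glm_hits += abs(poisson_glm_slope_z(x, y)) > Z_CRIT
    return lm_hits / n_sims, glm_hits / n_sims


if __name__ == "__main__":
    lm_rate, glm_rate = type1_rates()
    print(f"LM on log(y + 1): {lm_rate:.3f}   Poisson GLM: {glm_rate:.3f}")
```

Under these assumptions the LM rejection rate should sit near the nominal 5%, while the Poisson GLM, which ignores the extra log‐normal variation, should reject far more often. The sketch covers only a single predictor and only type I error, not statistical power.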

Journal: Methods in Ecology and Evolution
Publication Date: May 6, 2015
Citations: 177
