Abstract

Scientists in biomedical and psychosocial research routinely deal with skewed data. When comparing means from two groups, the log transformation is the traditional technique for normalizing skewed data before applying the two-group t-test. An alternative that does not assume normality is the generalized linear model (GLM) combined with an appropriate link function. In this work, the two techniques are compared using Monte Carlo simulations; each consists of many iterations that simulate two groups of skewed data for three different sampling distributions: gamma, exponential, and beta. Both methods are then compared regarding Type I error rates, power, and the estimates of the mean differences. We conclude that the t-test with log transformation had superior performance over the GLM method for any data that are not normal and follow beta or gamma distributions. For exponentially distributed data, by contrast, the GLM method had superior performance over the t-test with log transformation.

Highlights

  • In the biosciences, with the escalating numbers of studies involving many variables and subjects, there is a belief among non-biostatistician scientists that the amount of data will reveal all there is to understand from it

  • We study skewed data from three different sampling distributions to test the difference between two-group means

  • Comparisons were made between the t-test applied to log-transformed data and the generalized linear model (GLM) fitted to the original data, after making sure that the targeted variables were not normal before transformation, regarding Type I error rates, power, and the estimates of the mean differences



Introduction

With the escalating numbers of studies involving many variables and subjects, there is a belief among non-biostatistician scientists that the amount of data will reveal all there is to understand from it. Data analysis can be significantly simplified when the variable of interest has a symmetric distribution (preferably a normal distribution) across subjects, but usually this is not the case. Monte Carlo simulation is used to investigate this matter in the case of comparing means from two groups.
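To make the kind of simulation described above concrete, the following is a minimal sketch, not the authors' code. It estimates the Type I error rate of both approaches when the two groups are drawn from the same gamma distribution. Several choices here are assumptions rather than details from the paper: the function names, the sample size, the gamma shape and scale, the use of a gamma GLM with log link, the Pearson dispersion estimate, and the normal approximation used for both p-values (chosen so that only the Python standard library is needed).

```python
import math
import random
from statistics import NormalDist, mean, variance


def two_sided_p(z):
    """Two-sided p-value using a standard normal reference (large-sample approximation)."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def log_t_test_p(x, y):
    """Pooled two-sample t statistic on log-transformed data.

    Assumption: the p-value uses the normal approximation to the t distribution,
    which is reasonable only for moderately large groups.
    """
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    nx, ny = len(lx), len(ly)
    pooled = ((nx - 1) * variance(lx) + (ny - 1) * variance(ly)) / (nx + ny - 2)
    t = (mean(lx) - mean(ly)) / math.sqrt(pooled * (1 / nx + 1 / ny))
    return two_sided_p(t)


def gamma_glm_p(x, y):
    """Wald test for the group effect in a gamma GLM with log link (assumed model).

    With a single group indicator as covariate, the fitted means equal the group
    sample means, so the group coefficient is log(mean_x) - log(mean_y) and its
    variance is phi * (1/nx + 1/ny), where phi is the Pearson dispersion estimate.
    """
    nx, ny = len(x), len(y)
    mx, my = mean(x), mean(y)
    pearson = sum(((v - mx) / mx) ** 2 for v in x) + sum(((v - my) / my) ** 2 for v in y)
    phi = pearson / (nx + ny - 2)
    z = (math.log(mx) - math.log(my)) / math.sqrt(phi * (1 / nx + 1 / ny))
    return two_sided_p(z)


def type_i_error(n_iter=2000, n=100, shape=2.0, scale=1.0, alpha=0.05, seed=0):
    """Share of iterations in which each test rejects although both groups
    come from the same gamma distribution (illustrative parameter values)."""
    rng = random.Random(seed)
    reject_t = reject_glm = 0
    for _ in range(n_iter):
        x = [rng.gammavariate(shape, scale) for _ in range(n)]
        y = [rng.gammavariate(shape, scale) for _ in range(n)]
        reject_t += log_t_test_p(x, y) < alpha
        reject_glm += gamma_glm_p(x, y) < alpha
    return reject_t / n_iter, reject_glm / n_iter
```

Calling `type_i_error()` returns a pair of rejection proportions, one for the log-transformed t-test and one for the GLM; values near `alpha` suggest that a test holds its nominal level under these assumed settings. Power could be explored the same way by giving the two groups different scale parameters.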

