Abstract

Purpose: The statistical terms 'correlation' and 'regression' are frequently mistaken for each other in the scientific literature, for reasons that remain unclear. This paper discusses their differences and similarities, arguing that in most circumstances regression is the more appropriate technique because it incorporates a notion of dependency of one variable on another.

Method: Pearson's correlation coefficient (r) is introduced as a method for estimating the degree of linear association between two normally distributed variables. The problem of 'least squares' regression (when y depends on x) is introduced by considering the best-fitting straight line through the points on a scatter plot.

Results: Correlation, regression analysis and residual estimation are discussed using examples from the author's own teaching experience.

Conclusions: Correlation and regression share some similarities. However, regression is the better technique to use because it carries a notion of dependency of one variable upon another. Checking a regression model includes examining the residuals, and the importance of plotting and examining them cannot be overemphasized. Residual examination should become as much a part of a regression analysis as the estimation of the regression coefficients themselves.
